A high-performance multithreaded C application that transforms text files using a pipeline architecture with four types of worker threads: readers, uppercase converters, space replacers, and writers.
This project demonstrates advanced multithreading concepts in C using POSIX threads (pthreads). It processes a text file by reading lines, converting them to uppercase, replacing spaces with underscores, and writing the transformed content back to the file. All stages run concurrently, with a configurable number of worker threads for each stage.
- Multithreaded Pipeline Architecture: Four distinct processing stages working in parallel
- Thread Synchronization: Uses mutexes and condition variables for safe concurrent access
- Configurable Thread Pools: Customize the number of threads for each processing stage
- In-Place File Modification: Writes transformed content back to the original file
- Detailed Logging: Each thread prints its operations for debugging and monitoring
- Reader Threads: Read lines from the input file
- Upper Threads: Convert text to uppercase
- Replace Threads: Replace spaces with underscores
- Writer Threads: Write transformed lines back to the file
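
The two transformation stages can be pictured as small in-place string functions. The following is a minimal sketch, not the code from main.c; the function names `to_upper_line` and `replace_spaces` are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: convert a NUL-terminated line to uppercase in place. */
void to_upper_line(char *line)
{
    for (size_t i = 0; line[i] != '\0'; i++) {
        /* Cast to unsigned char: toupper() is undefined for negative values. */
        line[i] = (char)toupper((unsigned char)line[i]);
    }
}

/* Hypothetical helper: replace every space with an underscore in place. */
void replace_spaces(char *line)
{
    for (size_t i = 0; line[i] != '\0'; i++) {
        if (line[i] == ' ') {
            line[i] = '_';
        }
    }
}
```

Because one function only touches letters and the other only touches spaces, applying them in either order gives the same result.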
- GCC compiler
- POSIX-compliant system (Linux, macOS, Unix)
- pthread library
    gcc -pthread main.c -o main

Or use the included build output:

    gcc -fdiagnostics-color=always -g main.c -o main

    ./main -d <filename> -n <nRead> <nUpper> <nReplace> <nWrite>

- -d <filename>: Path to the input text file
- -n: Flag indicating thread count parameters follow
- <nRead>: Number of reader threads
- <nUpper>: Number of uppercase converter threads
- <nReplace>: Number of space replacer threads
- <nWrite>: Number of writer threads
    ./main -d test.txt -n 2 3 3 2

This command processes test.txt using:
- 2 reader threads
- 3 uppercase converter threads
- 3 space replacer threads
- 2 writer threads
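
A command line of this shape could be validated as in the sketch below. It assumes the fixed argument order shown above. The struct, the function names, and the upper bound of 1024 threads per stage are illustrative assumptions and are not taken from main.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed command line. */
struct pipeline_config {
    const char *filename;
    int n_read, n_upper, n_replace, n_write;
};

/* Parse a positive thread count; returns -1 on invalid input.
 * The cap of 1024 is an arbitrary assumption for this sketch. */
static int parse_count(const char *s)
{
    char *end;
    long v = strtol(s, &end, 10);
    if (*s == '\0' || *end != '\0' || v <= 0 || v > 1024) {
        return -1;
    }
    return (int)v;
}

/* Expects: prog -d <filename> -n <nRead> <nUpper> <nReplace> <nWrite>.
 * Returns 0 on success, -1 if the arguments are malformed. */
int parse_args(int argc, char **argv, struct pipeline_config *cfg)
{
    if (argc != 8 || strcmp(argv[1], "-d") != 0 || strcmp(argv[3], "-n") != 0) {
        return -1;
    }
    cfg->filename  = argv[2];
    cfg->n_read    = parse_count(argv[4]);
    cfg->n_upper   = parse_count(argv[5]);
    cfg->n_replace = parse_count(argv[6]);
    cfg->n_write   = parse_count(argv[7]);
    if (cfg->n_read < 0 || cfg->n_upper < 0 ||
        cfg->n_replace < 0 || cfg->n_write < 0) {
        return -1;
    }
    return 0;
}
```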
A Python script is included to generate test files with random text:
    python3 fileGenerate.py

This creates test.txt with 950 lines of random words (5-15 words per line, 3-10 characters per word).
The application uses several synchronization primitives:
- Mutexes:
  - fileLockRead: Protects file read operations
  - fileLockWrite: Protects file write operations
  - upperJobSelect: Coordinates uppercase thread job assignment
  - replaceJobSelect: Coordinates replace thread job assignment
  - writeJobSelect: Coordinates writer thread job assignment
  - Per-line mutexes: Protect individual line data structures
- Condition Variables:
  - condUpperReady: Signals when data is ready for uppercase conversion
  - condReplaceReady: Signals when data is ready for space replacement
  - condWriteReady: Signals when data is ready for writing
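
The sketch below shows how a job-selection mutex and a "ready" condition variable typically cooperate. The names upperJobSelect and condUpperReady are borrowed from the list above, but the counters and functions are assumptions made for illustration, so the actual logic in main.c may differ.

```c
#include <pthread.h>

static pthread_mutex_t upperJobSelect = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  condUpperReady = PTHREAD_COND_INITIALIZER;

/* Hypothetical shared state, always accessed with upperJobSelect held. */
static int lines_read     = 0; /* lines published by reader threads   */
static int next_upper_job = 0; /* next line index an upper thread takes */
static int reading_done   = 0; /* set once no more lines will arrive  */

/* Reader side: publish one more line and wake one waiting upper thread. */
void publish_line(void)
{
    pthread_mutex_lock(&upperJobSelect);
    lines_read++;
    pthread_cond_signal(&condUpperReady);
    pthread_mutex_unlock(&upperJobSelect);
}

/* Reader side: announce the end of input and wake every waiter. */
void finish_reading(void)
{
    pthread_mutex_lock(&upperJobSelect);
    reading_done = 1;
    pthread_cond_broadcast(&condUpperReady);
    pthread_mutex_unlock(&upperJobSelect);
}

/* Upper side: block until a line is available.
 * Returns the claimed line index, or -1 when no work remains. */
int claim_upper_job(void)
{
    int job = -1;
    pthread_mutex_lock(&upperJobSelect);
    /* Loop to guard against spurious wakeups. */
    while (next_upper_job >= lines_read && !reading_done) {
        pthread_cond_wait(&condUpperReady, &upperJobSelect);
    }
    if (next_upper_job < lines_read) {
        job = next_upper_job++;
    }
    pthread_mutex_unlock(&upperJobSelect);
    return job;
}
```

Holding the mutex while claiming the index ensures that no two threads in the same stage pick the same line.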
- Reader threads read lines sequentially and store them in a shared array
- Upper and Replace threads process lines independently, so the two transformations of a given line can be applied in either order
- Writer threads wait for both transformations to complete before writing
- All threads coordinate using condition variables to ensure data dependencies are met
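
The writer's dependency on both transformations can be sketched with two completion flags per line. This is an illustrative assumption about how such a dependency might be tracked: the struct `line_slot`, its per-slot condition variable, and the function names are hypothetical and do not mirror main.c.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-line record with its own mutex and condition variable. */
struct line_slot {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    bool upper_done;
    bool replace_done;
};

void slot_init(struct line_slot *s)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->changed, NULL);
    s->upper_done = false;
    s->replace_done = false;
}

/* Called by an upper or replace thread once its transformation is finished. */
void slot_mark_done(struct line_slot *s, bool is_upper)
{
    pthread_mutex_lock(&s->lock);
    if (is_upper) {
        s->upper_done = true;
    } else {
        s->replace_done = true;
    }
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);
}

/* Called by a writer: blocks until both transformations are complete. */
void slot_wait_writable(struct line_slot *s)
{
    pthread_mutex_lock(&s->lock);
    while (!(s->upper_done && s->replace_done)) {
        pthread_cond_wait(&s->changed, &s->lock);
    }
    pthread_mutex_unlock(&s->lock);
}
```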
- MAX_LINE_LENGTH: 1024 characters per line
- MAX_LINES: 1000 lines maximum
These can be modified in the source code if needed.
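
As a rough illustration of how these limits might bound a shared line array, consider the sketch below. The constant names and values come from this README; the array `lines`, the counter `line_count`, and the function `load_lines` are assumptions, not the declarations used in main.c.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE_LENGTH 1024
#define MAX_LINES 1000

/* Hypothetical shared storage sized by the two limits. */
static char lines[MAX_LINES][MAX_LINE_LENGTH];
static int line_count = 0;

/* Load at most MAX_LINES lines from fp, stripping the trailing newline.
 * Returns the number of lines stored. */
int load_lines(FILE *fp)
{
    line_count = 0;
    while (line_count < MAX_LINES &&
           fgets(lines[line_count], MAX_LINE_LENGTH, fp) != NULL) {
        lines[line_count][strcspn(lines[line_count], "\n")] = '\0';
        line_count++;
    }
    return line_count;
}
```

In this sketch, a line longer than MAX_LINE_LENGTH - 1 characters is split across several slots by fgets, and input beyond MAX_LINES lines is left unread.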
Read_1 read the line 1 which is "hello world example text"
Upper_1 read index 0 and converted "hello world example text" to "HELLO WORLD EXAMPLE TEXT"
Replace_1 read index 0 and converted "HELLO WORLD EXAMPLE TEXT" to "HELLO_WORLD_EXAMPLE_TEXT"
Writer_1 write line 1 back which is "HELLO_WORLD_EXAMPLE_TEXT"
- Increasing thread counts can improve performance on multi-core systems
- Too many threads may cause contention and reduce performance
- Optimal thread counts depend on your system's CPU core count and I/O capabilities
- The pipeline architecture allows different stages to work in parallel
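
When choosing thread counts, the number of online CPU cores is a reasonable starting point. The sketch below queries it with sysconf. Note that `_SC_NPROCESSORS_ONLN` is a widely supported extension (available on Linux and macOS) rather than a name required by POSIX, and the helper `online_cores` is hypothetical.

```c
#include <unistd.h>

/* Hypothetical helper: number of CPU cores currently online,
 * falling back to 1 if the value cannot be determined. */
long online_cores(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}
```

The transformation stages are CPU-bound while the reader and writer stages are limited by file I/O and their file locks, so the best split across stages is worth measuring rather than assuming.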
MIT License - This project is free to use for educational and commercial purposes.
See the LICENSE file for details.
Created by Nuraddin0, isasimsekk.