Educational POSIX threads project showcasing producer-consumer coordination, per-item locking, and condition-variable signaling across multiple thread pools for deterministic file rewriting.

Nuraddin0/Multithreaded-Text-Transformer

Multithreaded Text Transformer

A multithreaded C application that transforms text files through a four-stage pipeline of worker threads: readers, uppercase converters, space replacers, and writers.

Overview

This project demonstrates multithreading in C with POSIX threads (pthreads). It reads a text file line by line, converts each line to uppercase, replaces spaces with underscores, and writes the transformed content back to the same file. The stages run concurrently, and the number of worker threads for each stage is configurable.

Features

  • Multithreaded Pipeline Architecture: Four distinct processing stages working in parallel
  • Thread Synchronization: Uses mutexes and condition variables for safe concurrent access
  • Configurable Thread Pools: Customize the number of threads for each processing stage
  • In-Place File Modification: Writes transformed content back to the original file
  • Detailed Logging: Each thread prints its operations for debugging and monitoring

Processing Pipeline

  1. Reader Threads: Read lines from the input file
  2. Upper Threads: Convert text to uppercase
  3. Replace Threads: Replace spaces with underscores
  4. Writer Threads: Write transformed lines back to the file
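The two transformation stages can be sketched as plain in-place string functions. This is a minimal illustration, not the project's actual code: the function names to_upper_line and replace_spaces are hypothetical, and the sketch assumes single-byte (ASCII) text.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Stage 2 (hypothetical name): convert every character to uppercase, in place. */
void to_upper_line(char *s) {
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        /* Cast to unsigned char: toupper() is undefined for negative values. */
        *s = (char)toupper((unsigned char)*s);
    }
}

/* Stage 3 (hypothetical name): replace every space with an underscore, in place. */
void replace_spaces(char *s) {
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (*s == ' ') {
            *s = '_';
        }
    }
}
```

Both functions keep the string length unchanged and touch disjoint sets of characters (letters versus spaces), so applying them in either order gives the same result.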

Prerequisites

  • GCC compiler
  • POSIX-compliant system (Linux, macOS, Unix)
  • pthread library

Compilation

gcc -pthread main.c -o main

Or build with debug symbols and colored diagnostics:

gcc -pthread -fdiagnostics-color=always -g main.c -o main

Usage

./main -d <filename> -n <nRead> <nUpper> <nReplace> <nWrite>

Parameters

  • -d <filename>: Path to the input text file
  • -n: Flag indicating that the four thread counts follow
  • <nRead>: Number of reader threads
  • <nUpper>: Number of uppercase converter threads
  • <nReplace>: Number of space replacer threads
  • <nWrite>: Number of writer threads
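One way to validate this command line is sketched below. The Config struct, parse_count, and parse_args are hypothetical names introduced for illustration, and the upper bound of 1024 threads per stage is an arbitrary assumption; the project's own parsing may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the parsed command line. */
typedef struct {
    const char *filename;
    int n_read, n_upper, n_replace, n_write;
} Config;

/* Parse a positive decimal count; returns -1 on any malformed input.
   The 1024 cap is an assumption, not a limit taken from the project. */
static int parse_count(const char *s) {
    char *end = NULL;
    long v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || v <= 0 || v > 1024) {
        return -1;
    }
    return (int)v;
}

/* Expects: ./main -d <filename> -n <nRead> <nUpper> <nReplace> <nWrite>
   Returns 0 on success, -1 if the arguments do not match that shape. */
int parse_args(int argc, char *const argv[], Config *cfg) {
    assert(cfg != NULL);
    if (argc != 8) {
        return -1;
    }
    if (strcmp(argv[1], "-d") != 0 || strcmp(argv[3], "-n") != 0) {
        return -1;
    }
    cfg->filename  = argv[2];
    cfg->n_read    = parse_count(argv[4]);
    cfg->n_upper   = parse_count(argv[5]);
    cfg->n_replace = parse_count(argv[6]);
    cfg->n_write   = parse_count(argv[7]);
    if (cfg->n_read < 0 || cfg->n_upper < 0 ||
        cfg->n_replace < 0 || cfg->n_write < 0) {
        return -1;
    }
    return 0;
}
```

Rejecting zero or non-numeric counts up front avoids starting a pipeline in which one stage has no threads and the later stages would wait forever.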

Example

./main -d test.txt -n 2 3 3 2

This command processes test.txt using:

  • 2 reader threads
  • 3 uppercase converter threads
  • 3 space replacer threads
  • 2 writer threads

Test File Generation

A Python script is included to generate test files with random text:

python3 fileGenerate.py

This creates test.txt with 950 lines of random words (5-15 words per line, 3-10 characters per word).

How It Works

Thread Synchronization

The application uses several synchronization primitives:

  • Mutexes:

    • fileLockRead: Protects file read operations
    • fileLockWrite: Protects file write operations
    • upperJobSelect: Coordinates uppercase thread job assignment
    • replaceJobSelect: Coordinates replace thread job assignment
    • writeJobSelect: Coordinates writer thread job assignment
    • Per-line mutexes: Protect individual line data structures
  • Condition Variables:

    • condUpperReady: Signals when data is ready for uppercase conversion
    • condReplaceReady: Signals when data is ready for space replacement
    • condWriteReady: Signals when data is ready for writing
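The job-selection pattern behind upperJobSelect and condUpperReady might look roughly like the sketch below. Only the mutex and condition variable names come from the list above; the UpperQueue struct, its counters, and all function names are assumptions made for illustration, including the choice to have the reader publish under the same mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state for handing lines from readers to Upper threads. */
typedef struct {
    pthread_mutex_t upperJobSelect;  /* guards the three fields below */
    pthread_cond_t  condUpperReady;  /* signaled when lines_read grows */
    int  lines_read;                 /* lines published by readers so far */
    int  next_upper;                 /* next index an Upper thread may claim */
    bool reading_done;               /* set once the readers reach end of file */
} UpperQueue;

void queue_init(UpperQueue *q) {
    assert(q != NULL);
    pthread_mutex_init(&q->upperJobSelect, NULL);
    pthread_cond_init(&q->condUpperReady, NULL);
    q->lines_read = 0;
    q->next_upper = 0;
    q->reading_done = false;
}

/* Reader side: announce that one more line is available. */
void publish_line(UpperQueue *q) {
    pthread_mutex_lock(&q->upperJobSelect);
    q->lines_read++;
    pthread_cond_signal(&q->condUpperReady);
    pthread_mutex_unlock(&q->upperJobSelect);
}

/* Reader side: wake every waiter so idle Upper threads can exit. */
void finish_reading(UpperQueue *q) {
    pthread_mutex_lock(&q->upperJobSelect);
    q->reading_done = true;
    pthread_cond_broadcast(&q->condUpperReady);
    pthread_mutex_unlock(&q->upperJobSelect);
}

/* Upper side: block until a line is available, then claim it.
   Returns the claimed index, or -1 when no work remains. */
int claim_upper_job(UpperQueue *q) {
    int idx = -1;
    pthread_mutex_lock(&q->upperJobSelect);
    /* The loop re-checks the predicate to tolerate spurious wakeups. */
    while (q->next_upper >= q->lines_read && !q->reading_done) {
        pthread_cond_wait(&q->condUpperReady, &q->upperJobSelect);
    }
    if (q->next_upper < q->lines_read) {
        idx = q->next_upper++;
    }
    pthread_mutex_unlock(&q->upperJobSelect);
    return idx;
}
```

Because each index is claimed while the mutex is held, no two Upper threads can receive the same line. The same shape would apply to the Replace and Writer stages with their own mutex and condition variable.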

Data Flow

  1. Reader threads read lines sequentially and store them in a shared array
  2. Upper and Replace threads process lines independently (can happen in any order)
  3. Writer threads wait for both transformations to complete before writing
  4. All threads coordinate using condition variables to ensure data dependencies are met
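The dependency in step 3 can be sketched with two completion flags per line that a writer checks before proceeding. WriteGate and the function names here are hypothetical, and guarding the flags with writeJobSelect is an assumption; the project may track completion differently.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LINES 1000

/* Hypothetical tracker for which transformations each line has received. */
typedef struct {
    pthread_mutex_t writeJobSelect;   /* guards both flag arrays */
    pthread_cond_t  condWriteReady;   /* signaled when a line becomes writable */
    bool upper_done[MAX_LINES];
    bool replace_done[MAX_LINES];
} WriteGate;

void gate_init(WriteGate *g) {
    assert(g != NULL);
    pthread_mutex_init(&g->writeJobSelect, NULL);
    pthread_cond_init(&g->condWriteReady, NULL);
    for (int i = 0; i < MAX_LINES; ++i) {
        g->upper_done[i] = false;
        g->replace_done[i] = false;
    }
}

/* Caller must hold writeJobSelect. */
static bool line_ready(const WriteGate *g, int i) {
    return g->upper_done[i] && g->replace_done[i];
}

/* Called by an Upper thread after it has finished line i. */
void mark_upper_done(WriteGate *g, int i) {
    pthread_mutex_lock(&g->writeJobSelect);
    g->upper_done[i] = true;
    if (line_ready(g, i)) {
        pthread_cond_broadcast(&g->condWriteReady);
    }
    pthread_mutex_unlock(&g->writeJobSelect);
}

/* Called by a Replace thread after it has finished line i. */
void mark_replace_done(WriteGate *g, int i) {
    pthread_mutex_lock(&g->writeJobSelect);
    g->replace_done[i] = true;
    if (line_ready(g, i)) {
        pthread_cond_broadcast(&g->condWriteReady);
    }
    pthread_mutex_unlock(&g->writeJobSelect);
}

/* Writer side: block until both transformations of line i are complete. */
void wait_until_writable(WriteGate *g, int i) {
    pthread_mutex_lock(&g->writeJobSelect);
    while (!line_ready(g, i)) {
        pthread_cond_wait(&g->condWriteReady, &g->writeJobSelect);
    }
    pthread_mutex_unlock(&g->writeJobSelect);
}

/* Non-blocking check, convenient for tests and logging. */
bool is_writable(WriteGate *g, int i) {
    pthread_mutex_lock(&g->writeJobSelect);
    bool ready = line_ready(g, i);
    pthread_mutex_unlock(&g->writeJobSelect);
    return ready;
}
```

A broadcast is used instead of a single signal because several writers may be waiting on different lines, and only the one waiting on line i should proceed.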

Configuration Limits

  • MAX_LINE_LENGTH: 1024 characters per line
  • MAX_LINES: 1000 lines maximum

These can be modified in the source code if needed.
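To show how these limits might bound the shared line array, here is a minimal loading sketch. The load_lines function is hypothetical, and it assumes MAX_LINE_LENGTH is the buffer size including the terminating null byte; the source does not say which convention the project uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LINE_LENGTH 1024
#define MAX_LINES 1000

/* Hypothetical helper: read up to max_lines lines from fp into lines,
   stripping the trailing newline. Returns the number of lines stored. */
int load_lines(FILE *fp, char lines[][MAX_LINE_LENGTH], int max_lines) {
    assert(fp != NULL && lines != NULL);
    int n = 0;
    while (n < max_lines && fgets(lines[n], MAX_LINE_LENGTH, fp) != NULL) {
        /* strcspn returns the index of the first '\n', or the string
           length if there is none, so this write is always in bounds. */
        lines[n][strcspn(lines[n], "\n")] = '\0';
        n++;
    }
    return n;
}
```

In this sketch, input beyond MAX_LINES is silently ignored, and a line longer than the buffer is split across consecutive entries by fgets. A real program would probably want to report both cases.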

Example Output

Read_1 read the line 1 which is "hello world example text"
Upper_1 read index 0 and converted "hello world example text" to "HELLO WORLD EXAMPLE TEXT"
Replace_1 read index 0 and converted "HELLO WORLD EXAMPLE TEXT" to "HELLO_WORLD_EXAMPLE_TEXT"
Writer_1 write line 1 back which is "HELLO_WORLD_EXAMPLE_TEXT"

Performance Considerations

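  • Increasing thread counts can improve throughput on multi-core systems, particularly in the Upper and Replace stages, which work on separate lines
  • Too many threads increase contention on the shared mutexes and can reduce performance; file reads and writes are serialized by fileLockRead and fileLockWrite
  • Optimal thread counts depend on your system's CPU core count and I/O capabilities
  • The pipeline architecture lets different stages work on different lines at the same time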

License

MIT License - This project is free to use for educational and commercial purposes.

See the LICENSE file for details.

Author

Created by Nuraddin0 and isasimsekk.

This project demonstrates advanced multithreaded programming concepts in C using POSIX threads.
