-
Notifications
You must be signed in to change notification settings - Fork 1
Description
Source: LongestCommonSubsequence
Target: ILP
Motivation: Enables solving LCS instances via ILP solvers; the match-pair formulation naturally encodes subsequence ordering as linear constraints.
Reference: Lancia et al., 2001; general ILP formulations for sequence alignment in Althaus et al., 2006
Reduction Algorithm
Notation:
- Source:
$k = 2$ strings$s_1$ (length$n_1$ ) and$s_2$ (length$n_2$ ) over alphabet$\Sigma$ - Target: ILP with binary variables, linear constraints, and a maximize objective
For general
Variable mapping:
For each pair of positions
meaning "position
Total variables:
Constraints:
-
Each position in
$s_1$ matched at most once: for each$j_1 \in {0, \ldots, n_1-1}$ ,
$$\sum_{j_2 : (j_1,j_2) \in M} m_{j_1, j_2} \leq 1$$ -
Each position in
$s_2$ matched at most once: for each$j_2 \in {0, \ldots, n_2-1}$ ,
$$\sum_{j_1 : (j_1,j_2) \in M} m_{j_1, j_2} \leq 1$$ -
Order preservation (no crossings): for all
$(j_1, j_2), (j_1', j_2') \in M$ with$j_1 < j_1'$ and$j_2 > j_2'$ ,
$$m_{j_1, j_2} + m_{j_1', j_2'} \leq 1$$
Objective: maximize
Solution extraction: The matched positions with
Size Overhead
| Target metric (code name) | Polynomial (using symbols above) |
|---|---|
num_vars |
|
num_constraints |
$\leq n_1 + n_2 + \binom{ |
Validation Method
Closed-loop testing: solve LCS by brute-force, solve the reduced ILP, and verify that both give the same optimal objective. Test with varying alphabet sizes and string lengths, including edge cases (identical strings, no common characters, single-character alphabet).
Example
Source instance: ABAC, BACA, alphabet
Variables (6 match pairs):
| Variable |
|
|
Character |
|---|---|---|---|
| 0 | 1 | A | |
| 0 | 3 | A | |
| 1 | 0 | B | |
| 2 | 1 | A | |
| 2 | 3 | A | |
| 3 | 2 | C |
Constraints (13 total):
Assignment for
Assignment for
No-crossing:
Objective: maximize
Optimal ILP solution: objective = 3, with
Extracted LCS: