### Table 6: Parsimonious model

### Table 3 Haplotype analysis

"... In PAGE 5: ...05 and eight SNPs a P-value < 0.01 (Table 3). From each region we selected three SNPs (or two when three could not be found) with the most significant P-values for haplotype analysis since, in some situations, haplotype analysis is statistically more powerful.... ..."
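The selection rule quoted above (up to three SNPs per region, ranked by P-value) can be sketched as follows. This is only an illustration: the function name `select_snps`, the tuple layout, and all region names, SNP names, and P-values are hypothetical and do not come from the source study.

```python
from collections import defaultdict


def select_snps(results, per_region=3):
    """Pick the `per_region` SNPs with the smallest P-values in each region.

    `results` is an iterable of (region, snp_id, p_value) tuples.
    Regions with fewer SNPs than `per_region` simply return what they have.
    """
    by_region = defaultdict(list)
    for region, snp_id, p_value in results:
        by_region[region].append((p_value, snp_id))
    return {
        region: [snp_id for _, snp_id in sorted(hits)[:per_region]]
        for region, hits in by_region.items()
    }


# Made-up single-SNP association results, for illustration only.
results = [
    ("r1", "snp1", 0.030),
    ("r1", "snp2", 0.004),
    ("r1", "snp3", 0.012),
    ("r1", "snp4", 0.0009),
    ("r2", "snp5", 0.020),
    ("r2", "snp6", 0.008),
]
selected = select_snps(results)
print(selected)
```

Region `r2` has only two candidate SNPs here, so it yields two, mirroring the "or two when three could not be found" case.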

### Table 1: Phylogenetic parsimony criteria.

"... In PAGE 31: ... The hypothesis encoded in this tree is preferred because it explains as much of the observed character distributions as possible by character-state transitions in a common ancestor, and invokes the fewest ad hoc hypotheses of subsequent character-state change [Far83]. There are several phylogenetic parsimony criteria, each of which encodes a different model of evolution by placing different restrictions on the types and numbers of character-state transitions allowable in a tree (see Table 1). The Wagner Linear [KF69], Wagner General, and Fitch [Fit71] criteria assume the simplest model of evolution, in which character-state change is reversible.... In PAGE 33: ....2.1 Phylogenetic Parsimony Each of these problems is given as input a discrete character matrix for m taxa and d characters, and operates on an implicit graph G whose vertices are the set of all d-dimensional points defined by the states of the given characters and whose edges are specified by the allowable transitions between the states in these characters. Each phylogenetic parsimony problem seeks the evolutionary tree in G of minimum length that includes the given taxa, subject to the restrictions on character-state transitions that are particular to that problem's criterion (see Table 1). The given characters can be restricted in various ways to generate a family of... In PAGE 43: ... Question: Does the collection of characters C have a polarization such that there is a compatible collection C0 C such that jC0j B? Unconstrained Qualitative Compatibility (UQC) Instance: Collection C of d qualitative characters defined on a set of m objects; a positive integer B d. Question: Does the collection of characters C have a polarization and an ordering such that there is a compatible collection C0 C such that jC0j B? Table 10: Character compatibility decision problems (adapted from [DS86]).... In PAGE 44: ... 
B0 = B BQC p m BCC [DS86] d0 = d m0 = m X0 = [x0i;j]; 1 i d0; 1 j m0 where a character's most frequently occurring state becomes that character's ancestral state in X0. B0 = B Table 11: Reductions for character compatibility decision problems.... In PAGE 45: ... Question: Does there exist an additive tree T 2 Ad n such that X(D; A(T)) B? Fitting Unconstrained Matrices to Graph-Based Dominant Additive Trees (FUGT[ ]) Instance: Complete graph G = (V; E), jV j = n; semimetric D 2 Mn defined on all pairs of vertices in G; set of taxa S V ; and a positive integer B. Question: Is there a subtree T of G that includes S such that Pfx;yg2T D(x; y) B and [ A(T)]S DS? Table 12: Distance matrix fitting decision problems (adapted from [Day83, KM86, Day87, Kri88]).... In PAGE 47: ... Question: Does there exist an ultrametric tree U 2 Un;2 such that X(D; U(U)) B? Fitting Binary Matrices to Dominant Ultrametric Trees of Height 2 VIA STATISTIC X (FBUT2[X, ]) [X 2 fF1; F2g] Instance: Set S of n taxa; semimetric D 2 Bn; and a positive integer B. Question: Does there exist an ultrametric tree U 2 Un;2 such that X(D; U(U)) B and U(U) D? Table 13: Auxiliary decision problems for NP-hardness proofs of distance matrix fitting decision problems (adapted from [KM86, Day87, Kri88]). .... In PAGE 48: ... S0 = S + yi; 1 i ' D0 = [d0 i;j] = " D M M0 1 #, where M = [mi;j], mi;j = for all 1 i n and 1 j ', M0 is the transpose of M, and 1 is a square matrix with zeros on, but ones off, the main diagonal. B0 = B VC p m FUGT[ ] V = f g S f vi j 1 i jVVCjg S f ej j 1 j jEV Cjg D = [di;j], where d( ; vi) = 1 d( ; ej) = 2 d(vi; vj) = 4 d(vi; ej) = 1 if ej = fvi; xg 2 EV C, d(vi; ej) = 3 otherwise d(ei; ej) = 2 S = f g S f ej j 1 j jEVCjg B = K + jEV Cj Table 14: Reductions for distance matrix fitting decision problems.... 
In PAGE 52: ...Thesis Literature Phylogenetic UBfC,QgCS fC,QgCS [DJS86] Parsimony UBfC,QgDo fC,QgDO [DJS86] UBfC,QgCI fC,QgCI [DJS86] UBW SPQ [GF82, Day83] UUW SPP [GF82, Day83] WUOWL WTP [Day83] Character BfQ,CgC BfQ,CgC [DS86] Compatibility UfQ,CgC UfQ,CgC [DS86] Distance Matrix FBUT[F1] bHICy [KM86] Fitting FBUT2[F1] bHIC3y [KM86], 2 1y [Kri86], FUT[1] [Day87] FBUT2[F2] 2 2y [Kri86], FUT[2] [Day87] FUUT[F1] 1y [Kri86], HICy [KM86] FUUT[F2] 2y [Kri86] FUUT[F1; ] P4 [Kri88] FUDT[ ], 2 fF1; F2g FAT[ ], 2 f1; 2g [Day87] FUGT[ ] AET [Day83] Table 15: Correspondence between phylogenetic inference problems in this thesis and problems in the literature. All solution problems are marked with daggers (y); all other problems are decision problems.... In PAGE 74: ...Unweighted Weighted Given-Cost Given-Limit Decision - NP-complete Evaluation FPNP[O(logn)]-C FPNP jj -hard y - Solution FPNP jj -hard, properly FPNP[O(logn)]-hard, 2 NPMVg FPNP 2 NPMVg Spanning 2 Span(NPMVg FPNP ) 2 SpanP Enumeration 2 FP#P Random 2 FRP p 2 Generation Table 16: Computational complexities of phylogenetic inference functions. y Most weighted distance matrix fitting evaluation problems are only known to be properly FPNP[O(logn)]-hard (see Corollary 26).... In PAGE 78: ... Formula: maxT jfcj9x[(P(c; x) ^ x 2 T) _ (N(c; x) ^ :(x 2 T))]gj where P, N, and T are as defined for SAT. Table 17: Formulations of SAT in first-order logic (adapted from [KT90, PY91]). MAX NP.... In PAGE 79: ... Formula:maxT jf(x1; x2; x3)j [ ((x1; x2; x3) 2 C0 ! x1 2 T _ x2 2 T _ x3 2 T) ^ ((x1; x2; x3) 2 C1 ! x1 62 T _ x2 2 T _ x3 2 T) ^ ((x1; x2; x3) 2 C2 ! x1 62 T _ x2 62 T _ x3 2 T) ^ ((x1; x2; x3) 2 C3 ! x1 62 T _ x2 62 T _ x3 62 T) ] gj; where C0, C1, C2, C3, and T are as defined for 3SAT. Table 18: Formulations of SAT in first-order logic (cont'd from Table 17).... In PAGE 82: ... 4. 
Given a solution W of cost c to an instance of SOL-MIN-FBUT2[F1] derived by the reduction from X3C given in [KM86] (See Table 14), in polynomial time we can find a canonical solution W0 with cost c0 c. 5.... In PAGE 82: ... 5. Given a solution W of cost c to an instance of SOL-MIN-FUDT[F ] ( 2 f1; 2g) derived by the reductions from FBUT2[ ] given in [Day87] (see Table 14), in polynomial time we can find a canonical solution W0 of cost c0 c.... In PAGE 84: ... As the Generalized parsimony criterion can simulate any ordered phylogenetic parsimony problem, (5) can be proved by a variant on any of the proofs for (1 - 4). Proofs of (6 { 7): By the reductions given in Table 11, solutions to SOL-MAX-BCC (SOL-MAX-BQC) yield solutions to SOL-MAX-CLIQUE (SOL-MAX-BCC) of the same cost. Hence, these reductions yield L-reductions with = = 1.... In PAGE 84: ... Hence, this reduction is an L-reduction. Proof of (10): Consider the reduction from FBUT2[F ] to FUDT[F ] 2 f1; 2g given in [Day87] (see Table 14). As OPTFBUT2[F ] = OPTFUDT[F ], condition (L1) is satisfied with = 1.... ..."
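To make the notion of tree length under the Fitch criterion concrete, here is a minimal sketch of Fitch's bottom-up counting procedure for a fixed rooted binary tree with unordered, reversible character states. It only scores one given tree; it does not search for the minimum-length tree, which is the hard problem the excerpt is about. The tree encoding (nested tuples), the function names, and the toy character matrix are assumptions made for illustration.

```python
def fitch_character(tree, leaf_states):
    """Return (candidate_states, changes) for one character on a rooted binary tree.

    `tree` is either a leaf name (str) or a pair (left_subtree, right_subtree).
    `leaf_states` maps each leaf name to its observed state.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left_set, left_cost = fitch_character(tree[0], leaf_states)
    right_set, right_cost = fitch_character(tree[1], leaf_states)
    common = left_set & right_set
    if common:
        # The children can agree on a state, so no change is needed here.
        return common, left_cost + right_cost
    # No shared state: one change is required on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, matrix):
    """Sum the Fitch change counts over all characters (columns) in `matrix`.

    `matrix` maps each taxon name to a string with one state per character.
    """
    n_chars = len(next(iter(matrix.values())))
    total = 0
    for k in range(n_chars):
        column = {taxon: states[k] for taxon, states in matrix.items()}
        total += fitch_character(tree, column)[1]
    return total


# Toy data: four taxa, two binary characters (illustrative values only).
tree = (("A", "B"), ("C", "D"))
matrix = {"A": "00", "B": "10", "C": "11", "D": "11"}
print(tree_length(tree, matrix))  # 2
```

Each character needs one change on this tree, so the total length is 2. Comparing such lengths across candidate trees is what a parsimony search would do.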

### Table 1: Phylogenetic parsimony criteria.

"... In PAGE 26: ... The hypothesis encoded in this tree is preferred because it explains as much of the observed character distributions as possible by character-state transitions in a common ancestor, and invokes the fewest ad hoc hypotheses of subsequent character-state change [Far83]. There are several phylogenetic parsimony criteria, each of which encodes a different model of evolution by placing different restrictions on the types and numbers of character-state transitions allowable in a tree (see Table 1). The Wagner Linear [KF69], Wagner General, and Fitch [Fit71] criteria assume the simplest model of evolution, in which character-state change is reversible.... In PAGE 28: ....2.1 Phylogenetic Parsimony Each of these problems is given as input a discrete character matrix for m taxa and d characters, and operates on an implicit graph G whose vertices are the set of all d-dimensional points defined by the states of the given characters and whose edges are specified by the allowable transitions between the states in these characters. Each phylogenetic parsimony problem seeks the evolutionary tree in G of minimum length that includes the given taxa, subject to the restrictions on character-state transitions that are particular to that problem's criterion (see Table 1). The given characters can be restricted in various ways to generate a family of... In PAGE 38: ... Question: Does the collection of characters C have a polarization such that there is a compatible collection C0 C such that jC0j B? Unconstrained Qualitative Compatibility (UQC) Instance: Collection C of d qualitative characters defined on a set of m objects; a positive integer B d. Question: Does the collection of characters C have a polarization and an ordering such that there is a compatible collection C0 C such that jC0j B? Table 10: Character compatibility decision problems (adapted from [DS86]).... In PAGE 39: ... 
B0 = B BQC p m BCC [DS86] d0 = d m0 = m X0 = [x0i;j]; 1 i d0; 1 j m0 where a character's most frequently occurring state becomes that character's ancestral state in X0. B0 = B Table 11: Reductions for character compatibility decision problems.... In PAGE 40: ... Question: Does there exist an additive tree T 2 Ad n such that X(D; A(T)) B? Fitting Unconstrained Matrices to Graph-Based Dominant Additive Trees (FUGT[ ]) Instance: Complete graph G = (V; E), jV j = n; semimetric D 2 Mn defined on all pairs of vertices in G; set of taxa S V ; and a positive integer B. Question: Is there a subtree T of G that includes S such that Pfx;yg2T D(x; y) B and [ A(T)]S DS? Table 12: Distance matrix fitting decision problems (adapted from [Day83, KM86, Day87, Kri88]).... In PAGE 42: ... Question: Does there exist an ultrametric tree U 2 Un;2 such that X(D; U(U)) B? Fitting Binary Matrices to Dominant Ultrametric Trees of Height 2 VIA STATISTIC X (FBUT2[X, ]) [X 2 fF1; F2g] Instance: Set S of n taxa; semimetric D 2 Bn; and a positive integer B. Question: Does there exist an ultrametric tree U 2 Un;2 such that X(D; U(U)) B and U(U) D? Table 13: Auxiliary decision problems for NP-hardness proofs of distance matrix fitting decision problems (adapted from [KM86, Day87, Kri88]). .... In PAGE 43: ... S0 = S + yi; 1 i ' D0 = [d0 i;j] = " D M M0 1 #, where M = [mi;j], mi;j = for all 1 i n and 1 j ', M0 is the transpose of M, and 1 is a square matrix with zeros on, but ones off, the main diagonal. B0 = B VC p m FUGT[ ] V = f g S f vi j 1 i jVVCjg S f ej j 1 j jEV Cjg D = [di;j], where d( ; vi) = 1 d( ; ej) = 2 d(vi; vj) = 4 d(vi; ej) = 1 if ej = fvi; xg 2 EV C, d(vi; ej) = 3 otherwise d(ei; ej) = 2 S = f g S f ej j 1 j jEVCjg B = K + jEV Cj Table 14: Reductions for distance matrix fitting decision problems.... 
In PAGE 47: ...Thesis Literature Phylogenetic UBfC,QgCS fC,QgCS [DJS86] Parsimony UBfC,QgDo fC,QgDO [DJS86] UBfC,QgCI fC,QgCI [DJS86] UBW SPQ [GF82, Day83] UUW SPP [GF82, Day83] WUOWL WTP [Day83] Character BfQ,CgC BfQ,CgC [DS86] Compatibility UfQ,CgC UfQ,CgC [DS86] Distance Matrix FBUT[F1] bHICy [KM86] Fitting FBUT2[F1] bHIC3y [KM86], 2 1y [Kri86], FUT[1] [Day87] FBUT2[F2] 2 2y [Kri86], FUT[2] [Day87] FUUT[F1] 1y [Kri86], HICy [KM86] FUUT[F2] 2y [Kri86] FUUT[F1; ] P4 [Kri88] FUDT[ ], 2 fF1; F2g FAT[ ], 2 f1; 2g [Day87] FUGT[ ] AET [Day83] Table 15: Correspondence between phylogenetic inference problems in this thesis and problems in the literature. All solution problems are marked with daggers (y); all other problems are decision problems.... In PAGE 68: ...Unweighted Weighted Given-Cost Given-Limit Decision - NP-complete Evaluation FPNP[O(logn)]-C FPNP jj -hard y - Solution FPNP jj -hard, properly FPNP[O(logn)]-hard, 2 NPMVg FPNP 2 NPMVg Spanning 2 Span(NPMVg FPNP ) 2 SpanP Enumeration 2 FP#P Random 2 FRP p 2 Generation Table 16: Computational complexities of phylogenetic inference functions. y Most weighted distance matrix fitting evaluation problems are only known to be properly FPNP[O(logn)]-hard (see Corollary 26).... In PAGE 71: ... Formula: maxT jfcj9x[(P(c; x) ^ x 2 T) _ (N(c; x) ^ :(x 2 T))]gj where P, N, and T are as defined for SAT. Table 17: Formulations of SAT in first-order logic (adapted from [KT90, PY91]). MAX NP.... In PAGE 72: ... Formula:maxT jf(x1; x2; x3)j [ ((x1; x2; x3) 2 C0 ! x1 2 T _ x2 2 T _ x3 2 T) ^ ((x1; x2; x3) 2 C1 ! x1 62 T _ x2 2 T _ x3 2 T) ^ ((x1; x2; x3) 2 C2 ! x1 62 T _ x2 62 T _ x3 2 T) ^ ((x1; x2; x3) 2 C3 ! x1 62 T _ x2 62 T _ x3 62 T) ] gj; where C0, C1, C2, C3, and T are as defined for 3SAT. Table 18: Formulations of SAT in first-order logic (cont'd from Table 17).... In PAGE 75: ... 4. 
Given a solution W of cost c to an instance of SOL-MIN-FBUT2[F1] derived by the reduction from X3C given in [KM86] (See Table 14), in polynomial time we can find a canonical solution W0 with cost c0 c. 5.... In PAGE 75: ... 5. Given a solution W of cost c to an instance of SOL-MIN-FUDT[F ] ( 2 f1; 2g) derived by the reductions from FBUT2[ ] given in [Day87] (see Table 14), in polynomial time we can find a canonical solution W0 of cost c0 c.... In PAGE 77: ... As the Generalized parsimony criterion can simulate any ordered phylogenetic parsimony problem, (5) can be proved by a variant on any of the proofs for (1 - 4). Proofs of (6 { 7): By the reductions given in Table 11, solutions to SOL-MAX-BCC (SOL-MAX-BQC) yield solutions to SOL-MAX-CLIQUE (SOL-MAX-BCC) of the same cost. Hence, these reductions yield L-reductions with = = 1.... In PAGE 77: ... Hence, this reduction is an L-reduction. Proof of (10): Consider the reduction from FBUT2[F ] to FUDT[F ] 2 f1; 2g given in [Day87] (see Table 14). As OPTFBUT2[F ] = OPTFUDT[F ], condition (L1) is satisfied with = 1.... ..."

### Table 3. XRCC4 haplotypes

"... In PAGE 2: ... Association tests were done using all subjects (N = 1,040), and transmission tests were restricted to the subsample with relevant structure (39 trios and 167 sibships). In all analyses, the base variant with the minor allele was considered the allele of interest (see Table 3). For age at diagnosis analyses, we restricted the sample to affected breast cancer cases only.... In PAGE 3: ...1 to breast cancer cases (TDT, 5.44; P = 0.02). All other trio, sib, and combined trio-sib TDT analyses, as well as the combined trio-sib quantitative TDT statistics, were nonsignificant for all other single locus analyses. Table 3 lists all of the phased haplotypes and their frequencies that were observed in the subset of 94 unrelated individuals using the four XRCC4 tSNPs. There were 10 haplotypes ranging in frequency from 0.... ..."
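As a quick consistency check on the reported "TDT, 5.44; P = 0.02", the sketch below converts a test statistic to a P-value under the assumption that the statistic is chi-square distributed with 1 degree of freedom, as is usual for the biallelic TDT. The excerpt does not state which reference distribution the authors used, so that is an assumption, and the helper name `chi2_1df_pvalue` is hypothetical.

```python
import math


def chi2_1df_pvalue(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)), so only the standard library is needed.
    """
    return math.erfc(math.sqrt(statistic / 2.0))


p = chi2_1df_pvalue(5.44)
print(round(p, 3))  # 0.02
```

Under that assumption the statistic 5.44 corresponds to P of roughly 0.0197, which agrees with the rounded value quoted in the excerpt.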

### TABLE 2. Haplotype tagging SNPs

### Table 2 Maximum Parsimony and Likelihood Results

"... In PAGE 12: ...1 General sequencing and tree estimation results All ILD tests showed no significant incongruence between any of the partitions. All individual genes contained informative indels; the number of informative indels are shown in Table 2. All indels supported clades with high maximum parsimony bootstrap (MP BP) support and did not contribute support to any weakly supported clades.... In PAGE 12: ... All indels supported clades with high maximum parsimony bootstrap (MP BP) support and did not contribute support to any weakly supported clades. The aligned length of each gene partition is shown in Table 2. A 204 base pair (bp) insertion in all canid species in GHR was removed from the alignment, as was a 219 bp insertion in the red panda in RHO1.... In PAGE 12: ... In these cases, sequence was not included in the final data matrix (Table 1). For all partitions, multiple most parsimonious trees were obtained (Table 2). Both the GHR and complete data set MP searches were stopped due to computer memory constraints ... In PAGE 13: ... retention index (RI) are reported for each partition in Table 2. Individual gene topologies (MP BP) contained polytomies but did not differ significantly from each other or from the concatenated total data MP BP topology.... In PAGE 13: ... Differences between trees were only found in areas of weak support; no hard incongruencies (opposing topologies supported by 80% bootstrap support) were observed. All maximum likelihood searches yielded a single most likely tree (log likelihoods in Table 2), except FES, which recovered six equally likely trees. All FES trees were identical in topology and all parameters except the transition:transversion (Ti:Tv) ratio, Ti:Tv kappa, and gamma shape.... ..."