Like many users, I have an AncestryDNA match list filled with testers who have no trees. Over the years, I’ve built trees for matches I know in real life and for those I’ve communicated with online. Sleuthing skills helped me fill in the gaps for some unresponsive matches. But even after all my efforts, about a third of my closer matches (2nd – 3rd cousins) remain a mystery.
Then Dana Leeds introduced her color clustering technique to the Genetic Genealogy Tips & Techniques Facebook group. I was eager to try it, especially on my father’s side, where I have a couple of long-standing brick walls. My paternal side also has quite a bit of intermarriage among four key families, and I hoped color clustering might prove a nice way to illustrate our complex family.
I followed the instructions for clustering 2nd – 3rd cousins (those matches sharing between 90 and 400 cM) on my paternal side, and my result was not four nicely sorted columns. I expected it to be a little messy, but 10 columns was more complicated than I anticipated:
I sought Dana’s advice at her presentation to Houston Genealogical Forum’s DNA special interest group earlier this month. While she hasn’t extensively tested this method with endogamous populations or families with pedigree collapse, Dana suggested flipping the match list and clustering from lowest to highest shared cM. I tried her suggestion, and the 12-column result was unfortunately just as confusing:
I had some success on my maternal side by removing the “problematic matches” — those testers who match me in more than one way — and then clustering. However, the problematic matches on my paternal side are 80% of the list. From both attempts, I can clearly identify the clusters related to my 2x-great-grandfather Joshua Lawrence Horne, but all the other families — Johnston, Smart, McMurry, and McKaskle — are extremely mixed.
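For readers who would rather experiment in code than in a spreadsheet, here is a minimal Python sketch of the clustering steps as I understand them: sort the matches in the 90 – 400 cM range, give the first uncolored match a new column, and put every shared match of that person in the same column. It is not Dana’s own tooling or anything AncestryDNA provides. The function names, the `exclude` option, and the sample matches A through G are all hypothetical, and the shared-match lists would have to be copied by hand from each match’s page.

```python
def leeds_cluster(shared_cm, shared_matches, low=90, high=400,
                  ascending=False, exclude=()):
    """Group matches into columns; returns a list of sets of match names.

    shared_cm:      {match name: cM shared with me}
    shared_matches: {match name: set of names listed as shared matches}
    ascending:      False = highest cM first, True = the "flipped" list
    exclude:        names to leave out (e.g. known multi-way matches)
    """
    # Keep only matches in the chosen cM range that are not excluded.
    eligible = {
        name for name, cm in shared_cm.items()
        if low <= cm <= high and name not in exclude
    }
    # Sort by shared cM (name breaks ties so the order is repeatable).
    ordered = sorted(eligible, key=lambda n: (shared_cm[n], n),
                     reverse=not ascending)

    columns = []     # one set of names per color/column
    colored = set()  # everyone who already has at least one color
    for seed in ordered:
        if seed in colored:
            continue  # only uncolored matches start a new column
        members = {seed} | (set(shared_matches.get(seed, ())) & eligible)
        columns.append(members)
        colored |= members
    return columns


def multi_column_matches(columns):
    """Return the names that ended up in more than one column."""
    counts = {}
    for column in columns:
        for name in column:
            counts[name] = counts.get(name, 0) + 1
    return {name for name, n in counts.items() if n > 1}


# Hypothetical sample data, not my real match list.
SHARED_CM = {"A": 350, "B": 300, "C": 250, "D": 200,
             "E": 150, "F": 120, "G": 50}
SHARED_MATCHES = {
    "A": {"B", "E", "G"},
    "B": {"A", "E"},
    "C": {"D", "E"},
    "D": {"C"},
    "E": {"A", "B", "C"},
    "F": set(),
    "G": {"A"},
}

if __name__ == "__main__":
    for label, kwargs in [("highest first", {}),
                          ("lowest first", {"ascending": True}),
                          ("without E", {"exclude": {"E"}})]:
        cols = leeds_cluster(SHARED_CM, SHARED_MATCHES, **kwargs)
        print(label, [sorted(c) for c in cols],
              "multi:", sorted(multi_column_matches(cols)))
```

In this toy list, match E sits in two columns when sorting from the highest cM, and flipping the order simply moves the overlap to a different person. Leaving E out gives three clean columns, which is the same idea as removing the “problematic matches” before clustering. When most of the list overlaps, as on my paternal side, there is little left to cluster once those matches are removed.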
To illustrate, I prepared this simple family tree of my Johnston, Smart, McMurry, and McKaskle families and the intermarriages among them. I then plotted my top AncestryDNA matches on the chart and realized seven (!!) of my top ten are involved in this tangled web. No wonder my color cluster is a big blob!
As I’ve reflected on my color clustering results, I’ve come to the following conclusions:
Clustering will likely be difficult because of my grandparents’ shared Smart family connection (unknown relationship).
Close matches that would typically be helpful in sorting/filtering/clustering have multiple shared ancestors, eliminating them as useful “constants” for comparison.
Because of intermarriage, testers who match my father through only one ancestor couple likely exist at the 4th cousin level or greater. Unfortunately, according to ISOGG statistics, up to half of 4th cousins will not share enough DNA to show up as a match.
I may not have enough testers on desired family branches to be helpful in clustering.
With those conclusions in mind, here are my next steps:
Pursue DNA testing of these family lines:
Descendants of William Silas Johnston & Harriett Johnston (Johnston double-cousins)
Descendants of James Monroe McKaskle who did not intermarry with other family lines — Nancy Bell McKaskle, Willie Keiffer McKaskle, Sr.
Descendants of “lost siblings” of John McMurry from 1860 census.
Attempt a 4th cousin-only color cluster. Capturing data from cousins “less intermarried” may result in clearer clusters.