DNA Analysis – Page 2

Like many users, my AncestryDNA match list is filled with testers without trees. Over the years, I’ve built trees for matches I know in real life and those I’ve communicated with online. Sleuthing skills helped me fill in the gaps for some unresponsive matches. But even after all my efforts, about a third of my closer matches (2nd – 3rd cousins) remain a mystery.

Then Dana Leeds introduced her color clustering technique to the Genetic Genealogy Tips & Techniques Facebook group. I was eager to try it, especially on my father’s side where I have a couple of long-standing brick walls. My paternal side also has quite a bit of intermarriage among four key families, and I hoped color clustering might prove a nice way to illustrate our complex family.

I followed the instructions for clustering 2nd – 3rd cousins (matches sharing between 90 and 400 cM) on my paternal side, and my result was not four nicely sorted columns. I expected it to be a little messy, but 10 columns was more complicated than I anticipated:

Color Clustering - Traditional: Result of Leeds Color Clustering method on my paternal DNA matches *(clustering from highest-to-lowest shared cM)*
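For readers who think better in code than in spreadsheets, here is a minimal sketch of the clustering procedure as I understand it. It is not Dana's official implementation. The function name `leeds_clusters`, its parameters, and the five sample matches are all made up for illustration; the only values taken from this post are the 90 and 400 cM bounds. Each match in the range is visited in order of shared cM, and any match that has no color yet starts a new column that is also applied to everyone on its shared-match list.

```python
def leeds_clusters(matches, shared, lo=90, hi=400, ascending=False):
    """Sketch of Leeds-style color clustering (hypothetical helper).

    matches: dict of match name -> shared cM with the tester
    shared:  dict of match name -> set of names on that match's shared-match list
    Returns (columns, n_columns), where columns maps each in-range match
    to the set of column (color) indexes it received.
    """
    # Keep only matches in the 2nd-3rd cousin range described in the post.
    in_range = [name for name, cm in matches.items() if lo <= cm <= hi]
    # Highest-to-lowest by default; ascending=True flips the list.
    order = sorted(in_range, key=lambda name: matches[name], reverse=not ascending)

    columns = {name: set() for name in order}
    next_col = 0
    for name in order:
        if columns[name]:
            continue  # already colored by an earlier match, so no new column
        columns[name].add(next_col)
        for other in shared.get(name, ()):
            if other in columns:  # ignore shared matches outside the cM range
                columns[other].add(next_col)
        next_col += 1
    return columns, next_col


# Made-up sample data: F falls below 90 cM and is excluded.
matches = {"A": 350, "B": 300, "C": 220, "D": 150, "E": 95, "F": 50}
shared = {
    "A": {"B", "D"},
    "B": {"A", "D"},
    "C": {"D", "E"},
    "D": {"A", "B", "C"},
    "E": {"C"},
}

columns, n_cols = leeds_clusters(matches, shared)
rev_columns, rev_n_cols = leeds_clusters(matches, shared, ascending=True)
```

In this toy data, match D ends up in two columns when clustering from the top, while match C is the one with two columns when the list is flipped. That is a small-scale version of the overlap that intermarriage produces in a real match list.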

I sought Dana’s advice at her presentation to the Houston Genealogical Forum’s DNA special interest group earlier this month. While she hasn’t extensively tested this method with endogamous populations or families with pedigree collapse, Dana suggested flipping the match list and clustering from lowest to highest shared cM. I tried her suggestion, and the 12-column result was unfortunately just as confusing:

Color Clustering - Backward: Result of Leeds Color Clustering method on my paternal DNA matches *(clustering from lowest-to-highest shared cM)*

I had some success on my maternal side by removing the “problematic matches” (those testers who match me in more than one way) and then clustering. However, the problematic matches on my paternal side make up 80% of the list. From both attempts, I can clearly identify the clusters related to my 2x-great-grandfather Joshua Lawrence Horne, but all the other families (Johnston, Smart, McMurry, and McKaskle) are extremely mixed.
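One rough way to flag problematic matches in a clustered list is to look for anyone who landed in more than one column. The sketch below rests on that assumption; a tester can also match in more than one way without it showing up as multiple colors. The helper name `split_problematic` and the sample column assignments are hypothetical, not data from my match list.

```python
def split_problematic(columns):
    """Split clustered matches into single-column and multi-column groups.

    columns: dict of match name -> set of column (color) indexes.
    Assumption: a match in more than one column is treated as "problematic".
    Returns (clean, problematic, share), where share is the fraction of
    matches flagged as problematic.
    """
    clean = {name: cols for name, cols in columns.items() if len(cols) <= 1}
    problematic = {name: cols for name, cols in columns.items() if len(cols) > 1}
    share = len(problematic) / len(columns) if columns else 0.0
    return clean, problematic, share


# Made-up column assignments for five matches.
columns = {"A": {0}, "B": {0}, "C": {1}, "D": {0, 1}, "E": {1}}
clean, problematic, share = split_problematic(columns)
print(f"{len(problematic)} of {len(columns)} matches flagged ({share:.0%})")
```

When the flagged share is small, the clean group can be clustered again on its own. When it approaches 80%, as on my paternal side, too few matches are left to form meaningful columns.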

To illustrate, I prepared this simple family tree of my Johnston, Smart, McMurry, and McKaskle families and the intermarriages among them. I then plotted my top AncestryDNA matches on the chart and realized seven (!!) of my top ten are involved in this tangled web. No wonder my color cluster is a big blob!

Johnston-Smart-McKaskle-McMurry Intermarriage — **Intermarriage of Johnston, Smart, McKaskle, and McMurry Families** *(highest AncestryDNA matches plotted with dotted lines)* [download PDF]

As I’ve reflected on my color clustering results, I’ve come to the following conclusions:

- Clustering will likely be difficult because of my grandparents’ shared Smart family connection (unknown relationship).
- Close matches that would typically be helpful in sorting/filtering/clustering have multiple shared ancestors, eliminating them as useful “constants” for comparison.
- Because of intermarriage, testers who only match my father through one ancestor couple likely exist at the 4th cousin level or greater. Unfortunately, up to half of 4th cousins will not share enough DNA to show as a match according to ISOGG statistics.
- I may not have enough testers on desired family branches to be helpful in clustering.

Next Steps:

- Pursue DNA testing of these family lines:
  - Descendants of William Silas Johnston & Harriett Johnston (Johnston double-cousins)
  - Descendants of James Monroe McKaskle who did not intermarry with other family lines: Nancy Bell McKaskle, Willie Keiffer McKaskle, Sr.
  - Descendants of “lost siblings” of John McMurry from 1860 census.
- Attempt a 4th cousin-only color cluster. Capturing data from cousins “less intermarried” may result in clearer clusters.

Category: DNA Analysis


What a Tangled Web We Weave: Exploring Color Clustering with My Complicated Family