#### Visualizing Jewish history through the aggregate manuscripts of the Cairo Genizah

##### The Project

Using the tools of data science, this project demonstrates a new methodology for studying the Cairo Genizah documents. This is the first time this data set has been processed on such a comprehensive scale. The following interactive graphs are likely the first visualizations of the centuries-old network contained within the Cairo Genizah.

##### Treatment of God's Name

In Jewish law, reverence for God extends to the written word, and any document containing God's name is never to be treated as trash. Much medieval correspondence included a prayer and was thus never disposed of.

##### A Genizah

A Genizah is a storage space where texts and manuscripts that contain God’s name are kept after their useful life. This space for indefinite storage is often in a synagogue or cemetery.

##### The Cairo Genizah

The Jewish community of Fustat (present-day Cairo, Egypt) amassed a collection of approximately 200,000 documents that remained relatively well preserved in the arid desert climate. These documents provide an unfiltered history of medieval Jewish life.

##### My Approach

During my spring 2014 semester I took two inspiring classes: one in discrete mathematics and one in the history of Jews living in Islamic lands. During that semester I became fascinated by the Cairo Genizah and the scholars who dedicated themselves to studying its documents. At the same time I was learning about the power and versatility of discrete mathematical structures.

Along the way I realized the enormous potential of the unexplored structures contained in the Genizah documents. I subsequently reached out to the Mathematics department chair and set up an independent study focused on using the tools of discrete mathematics to study the Genizah documents. I presented those results in a paper and later created this webpage to share my findings.

##### The Process

The basis for these networks was written centuries ago and has since undergone many transformations. The steps of exchange, reformatting, and repurposing of this data can be understood through the flow chart on the right.

##### Visualizing Avenues of Communication

In the following graph, each edge represents a document, with nodes corresponding to its author and addressee. To draw a graph that is both comprehensive and meaningful, multiple algorithms were used to optimize the graph's layout.
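As a minimal sketch of the edge-per-document idea, the snippet below builds a small directed graph from hypothetical records. The record layout, the placeholder names, and the `build_correspondence_graph` helper are assumptions made for illustration and are not taken from the project's code. The edge weight here is simply the number of documents exchanged between the same author and addressee.

```python
from collections import Counter

# Hypothetical records: (document id, author, addressee).
# The ids and names are placeholders, not real Genizah data.
documents = [
    ("doc-1", "Author A", "Recipient B"),
    ("doc-2", "Author A", "Recipient B"),
    ("doc-3", "Recipient B", "Author C"),
]


def build_correspondence_graph(records):
    """Return (nodes, edge_weights); each document adds one directed edge."""
    nodes = set()
    edge_weights = Counter()
    for _doc_id, author, addressee in records:
        nodes.update((author, addressee))
        # Repeated correspondence between the same pair raises the weight.
        edge_weights[(author, addressee)] += 1
    return nodes, edge_weights


nodes, edge_weights = build_correspondence_graph(documents)
for (author, addressee), weight in sorted(edge_weights.items()):
    print(f"{author} -> {addressee}: {weight} document(s)")
```

A layout algorithm would then take these nodes and weighted edges as input. Which layout algorithms were used for the graph on this page is not specified here.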

##### More on Algorithms for Avenues of Communication

In data such as the Cairo Genizah documents, communication often spans multiple generations. It is therefore useful to have a method for highlighting the chains of correspondence that span the most generations. That is to say, we want a method for showing where ideas originate and for highlighting the connections that initiate the spread of an idea.

To interpret this concept in discrete mathematics, we treat the shortest path between two nodes as the purely efficient path between them. Accordingly, we are looking for the longest purely efficient path, that is, the longest of all the shortest paths in the graph. The following algorithm was developed to find such paths. The example graphs demonstrate how it works on a small data set, with edge color corresponding to edge weight.
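The idea of a "longest purely efficient path" can be sketched with a standard technique: run a breadth-first search from every node and keep the longest of the shortest paths found. This is not necessarily the algorithm used for the graphs on this page. It assumes a directed graph (author to addressee) and ignores edge weights, and the function names and toy graph are hypothetical.

```python
from collections import deque


def shortest_paths_from(graph, source):
    """Breadth-first search: map each reachable node to one shortest path."""
    paths = {source: [source]}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in graph.get(current, ()):
            if neighbor not in paths:
                # First visit in BFS order is along a shortest path.
                paths[neighbor] = paths[current] + [neighbor]
                queue.append(neighbor)
    return paths


def longest_shortest_path(graph):
    """Return the longest of all shortest paths between pairs of nodes."""
    best = []
    for source in graph:
        for path in shortest_paths_from(graph, source).values():
            if len(path) > len(best):
                best = path
    return best


# Toy directed graph: an edge u -> v means u wrote to v.
example = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": ["e"],
    "e": [],
}

path = longest_shortest_path(example)
print(" -> ".join(path))
```

Running one search per node is affordable for small graphs. A weighted version would replace the breadth-first search with Dijkstra's algorithm.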

###### Static Graph

###### Dynamic Graph

##### Connecting Language and Subject

The many languages and subjects contained in the Cairo Genizah documents give insight into the nature of medieval dialects. The following graph was drawn with each document forming an edge between the nodes representing that document's predominant language and subject.

##### More on Language and Subject

The methodology may first be understood through this example. Here Document I is a commentary on Exodus written in Judeo-Arabic. Document II is a divorce deed written in Judeo-Arabic. Document III is an Exodus commentary written in Hebrew.

These graphs are bipartite: the nodes fall into two independent groups (languages and subjects), and no edge joins two nodes in the same group.
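The three-document example above can be sketched in code as follows. The dictionary layout, the node labels, and the helper names are assumptions made for illustration. The check uses ordinary two-coloring, which succeeds exactly when a graph is bipartite.

```python
from collections import defaultdict, deque

# The three example documents: (predominant language, subject).
documents = {
    "Document I": ("Judeo-Arabic", "Exodus commentary"),
    "Document II": ("Judeo-Arabic", "Divorce deed"),
    "Document III": ("Hebrew", "Exodus commentary"),
}


def build_language_subject_graph(docs):
    """Each document adds one undirected edge between a language and a subject."""
    adjacency = defaultdict(set)
    for language, subject in docs.values():
        language_node = ("language", language)
        subject_node = ("subject", subject)
        adjacency[language_node].add(subject_node)
        adjacency[subject_node].add(language_node)
    return adjacency


def is_bipartite(adjacency):
    """Two-color the graph with BFS; fail if an edge joins equal colors."""
    color = {}
    for start in adjacency:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in color:
                    color[neighbor] = 1 - color[node]
                    queue.append(neighbor)
                elif color[neighbor] == color[node]:
                    return False
    return True


graph = build_language_subject_graph(documents)
print(is_bipartite(graph))
```

Because every edge is built from one language node and one subject node, the graph is bipartite by construction. The check only confirms it.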

###### Static Graph

###### Dynamic Graph

##### Past and Ongoing

There remains a great deal more to learn about the structures of these documents. The code for these graphs is freely available at my GitHub repository, and I plan to continue this research with experts in the field of the Cairo Genizah.

Below are PDF links to the two papers I’ve written on this subject. The first paper (left) is historically based and examines the scholars who pioneered this field. The second (right) is a report of my findings in the application of data science to this field. Those interested in collaborating should contact me.