“Engraved on the tablets” (Exodus 32:16); do not read “engraved” (harut) but rather “freedom” (herut).
Mishnah Avot 6:2
In the introduction to his enigmatic and category-defying treatise, Sefer ha-Yashar, Rabbenu Tam (France, twelfth century) vehemently attacks proofreaders of the Talmud who alter and “correct” the sayings of Tannaim (sages of the mishnaic period, pre–210 CE) and Amoraim (rabbinic sages who lived ca. third to fifth centuries CE); in doing so, he warns, they blur the boundary between interpretation—however brilliant—and the original version of the text, which must remain as it was. After all, as every philologist knows, the more difficult reading is often the key to a correct interpretation (lectio difficilior), and therefore should not be altered.
This position causes something of a family complication: Rashi, the grandfather of both Rabbenu Tam and the Rashbam (Rabbi Shmuel), is reputed to have proofread many talmudic sugyot (units). Rabbi Shmuel continued on the same path as his grandfather, as Rabbenu Tam notes: “for every one [sugya] proofread by Rabbenu Shlomo (=Rashi) he (=Rabbi Shmuel) has proofread twenty, and not only that, but he has erased [original versions in] the books.” While Rabbenu Tam defends Rashi—asserting that his grandfather almost always included the proofreading in his commentary and left the original unchanged—he cannot do the same for his brother, the Rashbam. By contrast, later evidence indicates that when Rabbenu Tam himself proofread texts, he used special graphic marks to distinguish his corrections from the text he had received, although those marked-up works have not survived.
This controversy between eminent medieval commentators of the Talmud is not just an intellectual family drama. It also reveals a range of possible positions on the relation between the text in its original form and its later interpretive cloak: from maintaining a safe distance between the source and its interpretation, through intervening only when necessary, to intensive editing of the text. And it reminds us that annotation is a medium through which commentators not only clarify the text but also converse with one another. Seen from this point of view, their discussion touches on the limits of the medium of interpretation and its historic role.
Undoubtedly, this argument is not unique to the rabbinic learning tradition. It is characteristic of most intellectual processes, cultures, and disciplines. Interestingly enough, however, when we transfer it to our digital age, and to the language of the movement (or discipline) known as Digital Humanities, it seems that the usual hierarchy between the source and the keys to its understanding undergoes a transformation that should not be underestimated. Translating Rabbenu Tam’s position into contemporary terms, we might say he insists on differentiating between data and metadata: if the talmudic text is data, then all its commentaries, including the textual alterations and proofreadings, are metadata. The former must remain original, preserving its place in a clear hierarchy of knowledge, while the latter must accept its secondary status.
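Rabbenu Tam’s demand can even be rendered, with a wink, as a data model. The sketch below is a minimal illustration in Python; the class names, field names, and sample values are invented for this example and come from no actual edition. It keeps the received text immutable and lets commentary accumulate in a separate layer that points into the text without rewriting it.

```python
from dataclasses import dataclass, field
from typing import List

# The source text is frozen: Rabbenu Tam's demand, in code.
# Any attempt to edit it in place raises an error.
@dataclass(frozen=True)
class Source:
    work: str   # e.g., a tractate and folio reference (illustrative)
    text: str   # the received version, never altered

# Commentary lives in a separate layer that points into the
# source by character offsets instead of rewriting it.
@dataclass
class Annotation:
    start: int    # offset into Source.text
    end: int
    author: str   # "Rashi", "Tosafot", ...
    note: str     # the interpretive content

@dataclass
class AnnotatedSource:
    source: Source
    annotations: List[Annotation] = field(default_factory=list)

    def annotate(self, start: int, end: int, author: str, note: str) -> None:
        # Metadata accumulates; the data stays exactly as it was.
        self.annotations.append(Annotation(start, end, author, note))
```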
But now the wall of defense put up by this approach, and what had seemed the simple, natural, and proper differentiation between the various layers of knowledge, is beginning to crumble. Anyone who has learned to value the original, untouched source, who respects the idiosyncratic nature of certain cultural phenomena, who lives peacefully with unsolved riddles, who understands the hermeneutic process as a chain of shifting responses to the object of interpretation—is about to discover that one of the first tasks in almost any attempt to integrate an object into the digital sphere involves subjecting it to some kind of framework, or connecting it to other objects. Data, more than ever before, cannot remain in its raw state.
From the computer’s point of view—if you will pardon the simplistic personification—the content we process through it has no meaning. In order to know what to do with such content, the computer requires keys. Thus, if the content is a text, the computer must know where the text begins and where it ends, what a title is, and what a quote is—and this is before we get to any of the less formal, more interpretation-dependent aspects of the text. For that we supply the computer with information about the data, or, to put it crudely, metadata. We aim for classification; and, as digital humanists like to say, we give structure to unstructured material by diverse means: cataloging it in the correct location in a database, inserting it into the rigid splint of spreadsheets, identifying it by designated tags, annotating it using a complex system of symbols, noting its location in time and space, and so on. If we don’t do all this, the text will remain unusable.
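To make this concrete, here is a toy illustration in Python. The tag names are ad hoc inventions, not a real TEI or library schema; the point is only that once the same words are wrapped in tags, the computer can answer questions about them that it could not answer about the raw string.

```python
import xml.etree.ElementTree as ET

# To the computer, an unstructured string is opaque:
raw = "Sefer ha-Yashar. Engraved on the tablets (Exodus 32:16)."

# Lightly marked up, the same words become queryable structure.
# (The tags are invented for illustration, not a real schema.)
marked_up = """
<passage>
  <title>Sefer ha-Yashar</title>
  <quote source="Exodus 32:16">Engraved on the tablets</quote>
</passage>
"""

root = ET.fromstring(marked_up)
print(root.find("title").text)           # -> Sefer ha-Yashar
print(root.find("quote").get("source"))  # -> Exodus 32:16
```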
The computer remains indifferent to all this. As far as the computer is concerned, we have linked one thing to another, but neither has any meaning—in the deep sense of the word—for the computer itself. Meaning is only for us, the human beings: we and our colleagues can now work with the material; data attached to metadata represents a whole to which we can relate, something we can read and analyze. Even better, we can now work on it together: read it afresh, connect the material with other “wholes,” share it, pass it on, and so on.
In the digital age, therefore, the dependence of data on metadata has grown in comparison with the predigital age. And, perforce, so has the dependence of data on the other sources to which the metadata connects it. The key word here is standardization: metadata functions efficiently only when it is accurate, consistent, relevant to more than one source, and agreed on by all. In the absence of these conditions, the data loses its value.
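A small sketch shows what “agreed on by all” means in practice. The schema, field names, and records below are invented for illustration; the point is that a record departing from the shared standard, however rich its content, simply drops out of every comparison and search.

```python
# A toy shared schema: standardization in miniature.
# (All field names and records here are invented.)
SCHEMA = {"author", "work", "date", "language"}

def conforms(record: dict) -> bool:
    # A record is usable only if everyone used the same fields.
    return set(record) == SCHEMA

records = [
    {"author": "Rashi", "work": "Commentary on the Talmud",
     "date": "11th c.", "language": "he"},
    # The same kind of information, but with private field names:
    {"mechaber": "Rabbenu Tam", "chibur": "Sefer ha-Yashar"},
]

usable = [r for r in records if conforms(r)]
print(len(usable))  # -> 1: the nonconforming record is invisible
```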
We must be thankful for the enormous effort invested in standardization; it enables us not only to switch calmly between computers and technologies, but also to send each other files, to find the right book in the library (or the right paper in the online database), to compare objects, and to relate to every item of knowledge in exactly the same way.
But there is another side to the coin, to which Rashi’s interesting case bears witness. In contrast to the medieval talmudic scholars known as the Tosafists, of whom Rabbenu Tam was the outstanding representative, Rashi refrained in his talmudic commentary from too broad a standardization, one that would inevitably become a kind of abstraction. His interpretive enterprise for each sugya (unit) confines itself to the limits of that sugya or, at most, of the chapter or the tractate—he almost never reconciles his reading of one passage with his readings elsewhere. His commentary, therefore, can be considered local annotation rather than consistent metadata (a label that better fits the Tosafists’ work), and, as such, it does not tend to impose on one source insights formed by reference to another.
However, as learners of the Talmud know, and as Rabbenu Tam writes, boundaries become obscured in proofreading. Although Rashi included his corrections in his commentary, usually introducing them with the expression hakhi garsinan (so goes our version), later students incorporated the corrections into the text itself, and today it is often almost impossible to reconstruct the version Rashi had before him. This development reflects something of the familiar dynamic between the Written Torah and the Oral Torah. Sometimes, as we see, the Oral Torah that should have accompanied the written one, supplying the keys to its understanding, becomes fossilized; the commentary replaces the source. As it turns out, metadata, even when local and inconsistent, can also become canonized, appearing as part of the original text, as if it had always been there.
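Rabbenu Tam’s graphic marks have a natural digital analogue: a structure that stores the received reading and the proposed correction side by side, so that displaying one never erases the other. The sketch below is illustrative; the class name, fields, and sample variant are invented, and a real editorial project would reach for an established standard such as TEI’s sic and corr elements.

```python
from dataclasses import dataclass

# Keep the transmitted text and the emendation apart: a digital
# equivalent of Rabbenu Tam's graphic marks. (Illustrative names.)
@dataclass(frozen=True)
class Reading:
    received: str   # the version as transmitted
    emended: str    # the proofreader's correction
    editor: str     # who proposed the correction
    note: str = ""  # e.g., "hakhi garsinan"

def display(r: Reading, show_emended: bool = False) -> str:
    # Either layer can be rendered without ever losing the other.
    return r.emended if show_emended else r.received

variant = Reading(received="amar Rava", emended="amar Rabbah",
                  editor="a proofreader", note="hakhi garsinan")
print(display(variant))                     # -> amar Rava
print(display(variant, show_emended=True))  # -> amar Rabbah
```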
This is forgivable when the interpretation is trivial and obvious. When the primary source is more intractable, however, it becomes more problematic. Many studies are already realizing the enormous potential of “distant reading,” shedding brilliant light on “the great unread” and uncovering what goes on beyond the borders of the familiar canon. Without adequate tools, however, such studies are sometimes given to “clustering,” “grouping,” and “smoothing,” that is, everyday computational practices aimed at obscuring differences and minimizing the effect that exceptional cases can have on patterns. To move ahead, new tools—new concepts—need to be found, not only to cope with those statistical patterns already inscribed on the tablets, as it were, but also to handle those cases that require a greater degree of exegetical freedom.
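To see what is at stake, consider a deliberately crude sketch in Python using scikit-learn, with synthetic data invented for the purpose: two ordinary groups and one genuinely exceptional item. Standard clustering has no category for the exception; it is simply absorbed into the nearest group, and its exceptionality disappears from the resulting pattern.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two dense groups of ordinary items (random 2-D feature vectors
# standing in for texts), plus one genuine outlier.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.3, size=(50, 2))
group_b = rng.normal(loc=5.0, scale=0.3, size=(50, 2))
outlier = np.array([[2.5, 12.0]])  # unlike everything else

X = np.vstack([group_a, group_b, outlier])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# The outlier is forced into one of the two clusters; nothing in
# the output records that it fit neither.
print(labels[-1], np.bincount(labels))
```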