The ARTECHNE database can be browsed in various ways. The list views and database structure allow for top-down browsing, for example by starting from the glossary and finding related records. The search options (normal and advanced) allow for bottom-up browsing: searching for terms and then narrowing down the results using the faceted search.
The ARTECHNE database is indexed using Solr. This allows for full-text searching of the database, as well as filtering the results in the sidebar using facets (content type, date range, language, and whether or not the record/source has a linked image). The advanced search also allows you to visualize the results of your search query. There are four options:
In the geographical visualization, records are shown on a map (according to the location of the library holding the source of the record) and on a timeline. This visualization thus allows you to investigate the geographical and temporal spread of the search term in the ARTECHNE database. To make this possible, we have geotagged the current location of the recipes and other sources in the database, i.e. added the coordinates (latitude and longitude) of the libraries and archives where the sources are currently kept. Over the course of the project (2016-2020), we will, as far as possible, add geotags for the origin of each source and link these tags to a timeline, giving you an interactive visualization of the geographical and temporal spread of the sources in the database.
This visualization has been created using GeoTemCo.
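The geotagging step described above amounts to attaching a coordinate pair to each record so it can be plotted on a map. As a sketch (the field names here are assumptions for illustration, not the actual ARTECHNE schema), converting such a record into a GeoJSON point feature could look like this:

```python
def to_geojson_feature(record):
    """Convert a record with library coordinates into a GeoJSON point feature.

    The keys 'longitude', 'latitude', 'title', and 'library' are assumed
    field names, not the actual ARTECHNE database schema.
    """
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON expects [longitude, latitude] order
            "coordinates": [record["longitude"], record["latitude"]],
        },
        "properties": {
            "title": record["title"],
            "library": record["library"],
        },
    }

feature = to_geojson_feature({
    "longitude": 5.12,
    "latitude": 52.09,
    "title": "Recipe collection",
    "library": "Utrecht University Library",
})
```

A list of such features can then be handed to most mapping libraries for display.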
The word cloud visualization shows the most frequent words in the transcriptions of your search results: the more frequent the word, the bigger it appears in the visualization. However, the most frequent words in the search results may simply be frequent throughout the whole corpus (e.g. function words like "the", "and", etc.), so we provide an option to normalize the frequencies using inverse document frequency. This boosts words that appear specifically in the search results and penalizes words that are frequent in the corpus as a whole. You can also set the minimum number of characters a word needs to appear in the word cloud, which can remove further unwanted words from the visualization (most function words are rather short). The word cloud frequencies can be exported to .csv format for further processing. Note that for the best results, you will want to fix the language filter.
This visualization has been created using Jason Davies' word cloud generator.
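The inverse-document-frequency normalization described above can be sketched as follows. This is an illustrative reimplementation, not the code used in the database, and the tokenization (lowercased whitespace splitting) is a simplifying assumption:

```python
import math
from collections import Counter

def wordcloud_weights(result_docs, corpus_docs, min_chars=4):
    """Weight words in the search results by term frequency times
    inverse document frequency over the whole corpus, so words that are
    frequent everywhere (function words) are pushed toward zero."""
    # Term frequencies in the search results, with a minimum word length
    tf = Counter(
        w for doc in result_docs
        for w in doc.lower().split()
        if len(w) >= min_chars
    )
    # Document frequencies over the whole corpus
    n = len(corpus_docs)
    df = Counter()
    for doc in corpus_docs:
        for w in set(doc.lower().split()):
            df[w] += 1
    # tf * idf; a word in every corpus document gets idf = log(1) = 0
    return {w: c * math.log(n / df[w]) for w, c in tf.items() if w in df}
```

With this weighting, a word appearing in every document of the corpus receives weight zero and effectively disappears from the cloud, while words specific to the search results keep a positive weight.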
The collocation visualization shows the words that appear close to the search term in the transcriptions, revealing which terms co-occur. It is similar to the word cloud visualization, but collocations are restricted to words appearing near the search term, whereas the word cloud counts words anywhere in the matching documents. Note that for the best results, you will want to fix the language filter.
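Collocation counting can be sketched minimally as below, assuming a simple whitespace tokenizer and a symmetric word window; the actual window size used by the visualization is not specified here:

```python
from collections import Counter

def collocations(docs, term, window=5):
    """Count words appearing within `window` tokens of `term`.

    Simplified sketch: lowercased whitespace tokenization, exact term
    matching, and only the matched occurrence itself is excluded from
    its own window.
    """
    counts = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        for i, tok in enumerate(tokens):
            if tok == term:
                lo = max(0, i - window)
                hi = min(len(tokens), i + window + 1)
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
                )
    return counts
```

The most common co-occurring words are then the highest-count entries, e.g. `collocations(docs, "pigment").most_common(10)`.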
The text reuse visualization allows you to see which texts overlap. Because it is implemented as a separate application, you first need to download your search results in .csv format from the "List" tab, and then upload that .csv file on the "Text reuse" tab. After uploading the file, the application shows the records with the most overlap in their transcriptions (calculated using Jaccard similarity). If you click on a row in the table, you can see the transcriptions of both records, as well as the longest alignment between the two.
The text reuse application is heavily dependent on the rOpenSci textreuse R package.
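Jaccard similarity measures the overlap between two sets as the size of their intersection divided by the size of their union; for text reuse it is typically computed over sets of word n-gram shingles. A minimal Python sketch (the shingle size `k=3` is an assumption for illustration, not necessarily what the application uses):

```python
def jaccard(text_a, text_b, k=3):
    """Jaccard similarity between two texts over word k-gram shingles."""
    def shingles(text, k):
        words = text.lower().split()
        # For texts shorter than k words, fall back to one shingle
        return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}
    a, b = shingles(text_a, k), shingles(text_b, k)
    return len(a & b) / len(a | b) if a | b else 0.0
```

Identical transcriptions score 1.0, texts with no shared word sequences score 0.0, and partially reused recipes fall somewhere in between.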
The source code for the visualizations is part of a Drupal Bootstrap sub-theme tailor-made for the ARTECHNE project and can be found on GitHub. The text reuse component is a Shiny application; its source code will soon be available on GitHub as well.