Transforming scholarship in the archives through handwritten text recognition: Transkribus as a case study

in publications :: #JournalOfDocumentation by Günter Mühlberger, Louise Seaward, Melissa Terras, Sofia Ares Oliveira, Vicente Bosch, Maximilian Bryan, Sebastian Colutto, Hervé Déjean, Markus Diem, Stefan Fiel, Basilis Gatos, Albert Greinoecker, Tobias Grüning, Günter Hackl, Vili Haukkovaara, Gerhard Heyer, Lauri Hirvonen, Tobias Hodel, Matti Jokinen, Philip Kahle, Mario Kallio, Frédéric Kaplan, Florian Kleber, Roger Labahn, Eva Maria Lang, Sören Laube, Gundram Leifert, Georgios Louloudis, Rory McNicholl, Jean-Luc Meunier, Johannes Michael, Elena Mühlbauer, Nathanael Philipp, Ioannis Pratikakis, Joan Puigcerver Pérez, Hannelore Putz, George Retsinas, Verónica Romero, Robert Sablatnig, Joan-Andreu Sánchez, Philip Schofield, Giorgos Sfikas, Christian Sieber, Nikolaos Stamatopoulos, Tobias Strauß, Tamara Terbul, Alejandro Héctor Toselli, Berthold Ulreich, Mauricio Villegas, Enrique Vidal, Johanna Walcher, Max Weidemann, Herbert Wurster, Konstantinos Zagoris

Purpose An overview of the current use of handwritten text recognition (HTR) on archival manuscript material, as provided by the EU H2020 funded Transkribus platform. It explains HTR, demonstrates Transkribus, gives examples of use cases, highlights the affect HTR may have on scholarship, and evidences this turning point of the advanced use of digitised heritage content. The paper aims to discuss these issues.

Design/methodology/approach This paper adopts a case study approach, using the development and delivery of the one openly available HTR platform for manuscript material.

Findings Transkribus has demonstrated that HTR is now a useable technology that can be employed in conjunction with mass digitisation to generate accurate transcripts of archival material. Use cases are demonstrated, and a cooperative model is suggested as a way to ensure sustainability and scaling of the platform. However, funding and resourcing issues are identified.

Research limitations/implications The paper presents results from projects: further user studies could be undertaken involving interviews, surveys, etc.

Practical implications Only HTR provided via Transkribus is covered: however, this is the only publicly available platform for HTR on individual collections of historical documents at time of writing and it represents the current state-of-the-art in this field.

Social implications The increased access to information contained within historical texts has the potential to be transformational for both institutions and individuals.

Originality/value This is the first published overview of how HTR is used by a wide archival studies community, reporting and showcasing current application of handwriting technology in the cultural heritage sector.

BibTex | DOI: 10.1108/JD-07-2018-0114


Convolutional Attention on Images for Locating Macular Edema

in publications :: #MIUA by Maximilian Bryan, J. Nathanael Philipp, Gerhard Heyer, Matus Rehak, Peter Wiedemann

Status published Summary:

Neural networks have become a standard for classifying images. However, by their very nature, their internal data representation remains opaque. To solve this dilemma, attention mechanisms have recently been introduced. They help to highlight regions in input data that have been used for a network’s classification decision. This article presents two attention architectures for the classification of medical images. Firstly, we are explaining a simple architecture which creates one attention map that is used for all classes. Secondly, we introduce an architecture that creates an attention map for each class. This is done by creating two U-nets - one for attention and one for classification - and then multiplying these two maps together. We show that our architectures well meet the baseline of standard convolutional classifications while at the same time increasing their explainability.

BibTex | DOI: 10.1007/978-3-030-39343-4_33


Auto encrypt all Incoming Email with postfix

in projects :: #admin, #coding, #gpgmail, #pgp

I am running my own mail server for a while now. Since the beginning I was thinking about how to store the mails encrypted, so that no one can read the mails with access to the server. The solution I came up with is relative easy to setup and is based upon OpenPGP/GnuPGP.

The basic idea is to take incoming mail before it is stored and encrypt it. I'm running postfix, which has the option to filter queued mails with external content filters. A content filter gets a mail via stdin, does whatever it needs to do and either rejects a mail or put it back into the mail queue.

I wrote a relativ simple Python script that takes a mail from stdin, processes it and then writes it back to stdout. The script can either decrypt, encrypt, sign or sign and encrypt a mail. It also tries to protect the mail headers following the memoryhole specs and supports Thunderbirds/Enigmails encrypted subject feature. The drawback is that Enigmail only supports the encrypted header from the memoryhole specs and other mail clients don't support them at all. For the content_filter in postfix I wrote a Bash script, that will resend the encrypted mail to put it back into the mail queue. The scripts can be found on GitHub.

Setup

  1. Install gpgmail
  2. Add a new user:

    adduser --shell /bin/false --home /home/gpgmail --disabled-password --disabled-login --gecos "" gpgmail
    
  3. Create .gnupg folder and change permissions:

    mkdir /home/gpgmail/.gnupg
    chown gpgmail:gpgmail /home/gpgmail/.gnupg/
    chmod 700 /home/gpgmail/.gnupg/
    
  4. If mails should not just get encrypted but also signed, create a new key pair:

    sudo -u gpgmail /usr/bin/gpg --homedir=/home/gpgmail/.gnupg --expert --full-gen-key
    
  5. Import public keys and chnage trust:

    sudo -u gpgmail /usr/bin/gpg --homedir=/home/gpgmail/.gnupg --import /home/gpgmail/pubkey.asc
    sudo -u gpgmail /usr/bin/gpg --homedir=/home/gpgmail/.gnupg --edit-key <KEY> trust save
    sudo -u gpgmail /usr/bin/gpg --homedir=/home/gpgmail/.gnupg --edit-key <KEY> trust quit
    
  6. Edit /etc/postfix/master.cf

    :::unixconfig
    smtp          inet  n       -       y       -       -       smtpd -o content_filter=gpgmail-pipe
    smtps         inet  n       -       y       -       -       smtpd -o content_filter=gpgmail-pipe
    submission    inet  n       -       y       -       -       smtpd -o content_filter=gpgmail-pipe
    gpgmail-pipe  unix  -       n       n       -       -       pipe
      flags=Rq user=gpgmail argv=/usr/bin/gpgmail-postfix sign-encrypt gnupghome=/home/gpgmail/.gnupg key=<KEY_ID> passphrase=<PASSPHRASE> encrypt-subject -oi -f ${sender} ${recipient}
    
  7. Restart postfix.

Sources


Evaluation of CNN architectures for text detection in historical maps

in publications :: #DATeCH

We evaluate different densely connected fully convolutional neural network architectures to find and extract text from maps. This is a necessary preprocessing step before OCR can be performed. In order to locate the text, we train a neural network to classify whether a given input is text or not. Our main focus is on the output level, either classifying text or no text for the whole input or predicting the text position pixel wise by outputting a mask. Acquiring enough training data especially for pixel wise prediction is quite a time consuming task, so we investigate a method to generate artificial training data. We compare three training scenarios. First training with images from historical maps, which is quite a small dataset, second adding artificially generated images and third training just with the artificially generated data.

Full Abstract | Poster


Zum Beispiel könnte man…

in misc :: #bücher

Zum Beispiel könnte man alle nationalen Pässe durch einen Europäischen Pass ersetzen. Ein Pass der Europäischen Union, in dem der Geburtsort vermerkt ist, aber nicht die Nationalität. Ich glaube, dass allein dies etwas im Bewusstsein der Generation bewirken würde, die mit einem solchen Pass aufwächst. Und das würde nicht einmal etwas kosten. […] Aber das ist nicht genug, setzte er fort.

Die Hauptstadt von Robert Menasse, Seite 392


Objdetect: Eine Plattform zur Visualisierung von Vorhersagen objekterkennender neuronaler Netze

in publications :: #DHDL

Im Rahmen unterschiedlicher Projekte und Qualifizierungsarbeiten entstanden und entstehen eine Reihe neuronaler Netze. Diese Netze dienen entweder der Klassifizierung von Bildern oder dem Auffinden von Objekten in Bildern. Um diese neuronalen Netze besser miteinander hinsichtlich unterschiedlicher Architekturen, unterschiedlicher Hyperparameter oder unterschiedlicher Trainingsdaten zu vergleichen und sie einem breiteren Personenkreis verfügbar zu machen, wurde eine Webseite entwickelt, auf der man trainierte neuronale Netze und Bilder hochladen, eine Objekterkennung starten und die Ergebnisse einheitlich visualisieren kann. Die Webseite soll sowohl für Entwickler neuronaler Netze sein, die Ihre Netze vergleichen wollen, als auch für Forscher, die auf ihren Bildern eine Objekterkennung durchführen wollen.

Poster | https://fdhl.info/dhdl-2018-materialien/


Generierung von Trainingsdaten für die Handschrifterkennung aus TEI annotierten Dokumenten – Ein Erfahrungsbericht aus dem EU-Projekt READ

in publications :: #INF-DH by Maximilian Bryan, Tobias Hodel, Nathanael Philipp

Zum Trainieren maschineller Lernverfahren zur Erkennung von Handschriften werden Textdaten mit korrespondierenden Bildern benötigt. Die Textdaten liegen häufig im TEI-Format das diverse Möglichkeiten eröffnet, um textuelle und semantische Phänomene auszuzeichnen, weiter können gar eigene Tags oder Auszeichnungsarten eingeführt werden. In diesem Beitrag wird ein im EU-Projekt READ entwickeltes parametrisierbares Tool beschrieben, das mit unterschiedlichen Auszeichnungsstilen in TEI umgehen kann und Textdateien auf Seitenbasis liefert, die zur Zuordnung von Text zu Bilddaten (text-to-image) genutzt werden können und somit zur Aufbereitung von Trainingsdaten für Modelle der Handschriftenerkennung dienen. Die gezeigten Beispiele und Anwendungen stammen alle aus Projekten, die ihre Daten für READ zur Verfügung stellten.

BibTex | DOI: 10.18420/infdh2018-11



Masterthesis: Objekterkennung mit Hilfe von Convolutional Neural Networks am Beispiel ägyptischer Hieroglyphen

in publications :: #thesis

en | de

In this thesis different architectures of Convolutional Neural Networks (CNN) and their suitability for object recognition were investigated by using the example of Egyptian hieroglyphs.

First, basic principles for artificial neural networks and the components of CNNs, such as convolutional layers, are introduced and explained, followed by explanations of the used data sets and the associated difficulties. We then present libraries for the concrete implementation and use of artificial neural networks and describe the sequence of the evaluation of object recognition.

The CNNs were trained and evaluated with different numbers of classes and the associated number of images. The experiments are divided by the three used training methods. For the first, the CNNs were trained with the help of autoencoders. For the second, the CNNs were trained block-wise, and for the third deeper network architectures were investigated. Various network architectures such as Residual Networks (ResNet) and Densely Connected Convolutional Networks, described in the literature, were implemented and evaluated.

The results of the experiments show that it is possible to train CNNs with up to 69 convolutional layers, to classify about 6500 different Egyptian hieroglyphs and finally to carry out an object recognition with very good results. The best results for object detection 0.92362 were achieved for a CNN with 6465 classes and 13 convolutional layers.

BibTex | PDF


Vulnerabilities and other stuff

in misc :: #coding, #stuff

I recently read an interresting post about the target="_blank" vulnerability. This vulnerability leaves a user open to a very simple phishing attack and is quite unknown. When a link uses the target="_blank" attribute not accompanied with the rel="noopener" attribute or in the case of Firefox rel="noopener noreferrer" the opening site gives the new site access to the existing window through the window.opener API, allowing a few permissions. Some of these permissions are automatically negated by cross-domain restrictions, but window.location is fair game.

To see this vulnerability in action you can use this link. It'll open the post in a new tab/window and redirect this window to an other page.

The code below shows the necessary code for the window.opener API to redirect the opening site to a new location.

:::javascript
if ( window.opener ) {
    window.opener.location = "https://jnphilipp.org/pages/page/gone-phishing/?referrer=" + document.referrer;
}

Because of that post, I removed all target="_blank" attributes from the links. I had also a few other changes that had pilled up and which I hadn't gotten around to put online. Most are on the back end side. On front end side I changed manly the color of the sidebar.