In this thesis different machine learning algorithms are evaluated for the task of multi-label classification. The evaluation is done with the binary classifiers naive Bayes and support vector machine (SVM) and the multi-class classifier supervised latent Dirichlet allocation (SLDA). To enable naive Bayes and SVM to do multi-label classification the RAkEL transformation is used and for SLDA a topic model multi-label learner is developed and used.
The Reuters-21578 corpus is used. Since not all texts have labels and not all labels occur in sufficient frequency a selection of texts was used. Two corpora were created and used for classification.
The classification results show that the best results are archived with SVM. Naive Bayes and SLDA give very similar results, but SLDA has a very long runtime.
BibTex | PDF
I recently read an interresting post about the target="_blank"
vulnerability. This vulnerability leaves a user open to a very simple phishing attack and is quite unknown. When a link uses the target="_blank"
attribute not accompanied with the rel="noopener"
attribute or in the case of Firefox rel="noopener noreferrer"
the opening site gives the new site access to the existing window through the window.opener
API, allowing a few permissions. Some of these permissions are automatically negated by cross-domain restrictions, but window.location
is fair game.
To see this vulnerability in action you can use this link. It'll open the post in a new tab/window and redirect this window to an other page.
The code below shows the necessary code for the window.opener
API to redirect the opening site to a new location.
:::javascript
if ( window.opener ) {
window.opener.location = "https://jnphilipp.org/pages/page/gone-phishing/?referrer=" + document.referrer;
}
Because of that post, I removed all target="_blank"
attributes from the links. I had also a few other changes that had pilled up and which I hadn't gotten around to put online. Most are on the back end side. On front end side I changed manly the color of the sidebar.
I must sadly announce the end of life for TIMA. Or at least the end of the TIMA website at https://tima.jnphilipp.org. This is due to the practically non existent traffic and my inability to maintain the site. The EOL will be at the end of the month, the 30th of September 2016. I will upload a database dump with the associations to this post after the shutdown.
Update: So the EOL of the TIMA website is reached. As promised a dump of the associations can be downloaded here as a JSON-file. For each word the language, count, identifier and associations are given, here the count indicates how often the word was answered. An association has the same informations, but here the count indicates how often the association was given to the word.
Over the last few weeks I added a few new features. The most extensive feature I added is the API. The API consists of two parts, the first is to retrieve the posts and projects as JSON. The other is an OAI-PMH endpoint, which returns XML. At the moment I only support the metadata in the Dublin Core format, but I plan to add CMDI. For details on the API I added a page to the project section. The second feature I added was inspired by this post about signing web content using PGP. I added signatures to the posts and projects which can be view in the source code and verified using my public key or with Keybase. On a side note, I got new certificates from Let’s Encrypt and forcing HTTPS now.
Since my last Post about TIMA a few thing have happened and changed. We added and FAQ page, and most noticeably a we added a section with games to the Website. Currently there is only one: AssociationChain.
AssociationChain is a simple game in which you and TIMA build an association chain together. The rules are as follows: You and TIMA alternately associate a word to the previous association. The goal is to build long chains.
As for the Apps it's a work in progress. The basic functionally of the Website is in the App and works we'll see that rest get's into it an that we can distribute it. As for the game Apps that will take some more time.
As for the TIMA itself. We currently support four languages: German, English, Spanish and Farsi. We have a total of over 3400 words and over 4000 unique associations.