Stylometry: Difference between revisions

From Clarin K-Centre
(Styloscope added)
 
Line 6:


*[https://stylene.uantwerpen.be/ Demo]
== Styloscope ==
Styloscope is a tool for '''automatic writing style analysis'''. It can be used to test hypotheses about large-scale corpora, parse documents, or detect outliers. Users can provide data either by uploading a local file or by using a publicly available Hugging Face dataset. Uploaded corpora can be CSV files with one document per row or ZIP archives in which each document is stored in its own text file. The output contains the parsed documents, raw statistics on writing style features such as syntactic dependencies, lexical richness, and readability, and visualizations of aggregated results.
* [https://github.com/clips/styloscope GitHub]

Latest revision as of 14:13, 29 May 2024

Stylene

Stylene is a robust, modular system for stylometry and readability research, built on existing techniques for automatic text analysis and machine learning. It is offered through a web service that allows researchers in the humanities and social sciences to analyze texts with this system.

In this way, the project makes recent advances in the computational modeling of style and readability available to researchers. The system was developed in cooperation between the CLiPS (University of Antwerp) and LT3 (Ghent University) research groups.

Styloscope

Styloscope is a tool for automatic writing style analysis. It can be used to test hypotheses about large-scale corpora, parse documents, or detect outliers. Users can provide data either by uploading a local file or by using a publicly available Hugging Face dataset. Uploaded corpora can be CSV files with one document per row or ZIP archives in which each document is stored in its own text file. The output contains the parsed documents, raw statistics on writing style features such as syntactic dependencies, lexical richness, and readability, and visualizations of aggregated results.
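
The two upload formats described above can be prepared with a few lines of Python. The following is a minimal sketch using only the standard library, not part of Styloscope itself. The function names, the CSV header name `text`, and the `doc_0001.txt` file naming scheme are assumptions made for illustration; the column name and file layout that the tool actually expects should be checked in its documentation.

```python
import csv
import io
import zipfile


def corpus_to_csv(docs, column="text"):
    """Serialize documents to CSV text, one document per row.

    The header name `column` is a hypothetical choice, not a documented
    Styloscope requirement.
    """
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow([column])
    for doc in docs:
        # csv.writer quotes fields containing commas or line breaks,
        # so each document stays in a single CSV row.
        writer.writerow([doc])
    return buffer.getvalue()


def corpus_to_zip(docs, prefix="doc"):
    """Pack documents into an in-memory ZIP archive, one .txt file each.

    The file naming scheme is a hypothetical choice for illustration.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for index, doc in enumerate(docs, start=1):
            # Strings passed to writestr are stored as UTF-8.
            archive.writestr(f"{prefix}_{index:04d}.txt", doc)
    return buffer.getvalue()
```

Both functions return the serialized corpus in memory (a string for CSV, bytes for ZIP), so the result can be written to a local file and uploaded. When saving the CSV string, open the file with `newline=""` and an explicit encoding such as UTF-8 so that line endings inside documents are preserved.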