Collaborative Research Practice / Platforms

a utopian project

Is it possible to create a platform that allows students, scholars and projects to collect, share and publish qualified historical information on the web?

Aims:
1) enhance individual efforts to collect qualified information so that it can be reused in future projects
2) ensure the durability of the collected information by improving and developing it in a collaborative system: e.g. websites, publishing a corpus
3) share information and collect a large amount of data that allows a new and broader level of study of past societies

Realisation:
1) Which projects aim to realize or already implement this utopian idea?
Heuristscholar (Ian Johnson, Sydney), Pinakes (Andrea Scotti, Florence)
2) Which methods or standards can be chosen for data modeling? (ERD/UML, RDF, TEI, …)
3) Which technologies can be used to store, analyze and publish the collected data? (database – XML-encoded text – semantic web, ontologies) -> metadata

Francesco’s Project
Outline of the presentation
I. Inventing a method: from historical information to data
II. SyMoGIH: its “levels”, its users
III. An example of use: exploring the data of the Scholastein project

Modeling: a scientific operation
1) How to transform historical information into data
– that can be stored in digital format
– and then exploited with different software?

2) Choosing a technology: relational database, XML document, RDF, …
– choose according to the project’s research question
– combine technologies
– adopt standards and follow best practices

3) choice according to the research question

4) the scientific operation lies in applying the method
– choose an approach: model the content of a source or a historical world -> What should you model now?

If you are modeling, you are already choosing what you are modeling
– distinguish between the research question and the objectivity of the information: a mission impossible
– manage the “sourcing” of the information -> One should always state where the information came from, and one should be able to verify it
– create a semantics

elements of a semantics
– defines historical objects represented by entities
– defines their possible relations: name of the association, cardinality properties…
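
The two elements above (entity types, and named relations with a cardinality) can be sketched roughly as follows. This is only an illustration: the class names, the `max_per_subject` field and the sample relations are assumptions made for the example, not SyMoGIH's actual vocabulary.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class EntityType:
    """A kind of historical object, e.g. a person or a place (hypothetical)."""
    name: str


@dataclass(frozen=True)
class RelationType:
    """A named association between two entity types, with a cardinality limit."""
    name: str
    subject: EntityType
    obj: EntityType
    max_per_subject: Optional[int] = None  # assumption: None means "unbounded"


@dataclass
class Semantics:
    """Illustrative registry of the relations the model permits."""
    relations: dict = field(default_factory=dict)

    def define(self, relation: RelationType) -> None:
        self.relations[relation.name] = relation

    def allows(self, relation_name, subject, obj, existing=0) -> bool:
        """Would one more link of this kind still respect the semantics?"""
        rel = self.relations.get(relation_name)
        if rel is None or rel.subject != subject or rel.obj != obj:
            return False
        return rel.max_per_subject is None or existing < rel.max_per_subject


PERSON = EntityType("person")
PLACE = EntityType("place")
INSTITUTION = EntityType("institution")

semantics = Semantics()
semantics.define(RelationType("born_in", PERSON, PLACE, max_per_subject=1))
semantics.define(RelationType("taught_at", PERSON, INSTITUTION))

first_birthplace_ok = semantics.allows("born_in", PERSON, PLACE, existing=0)
second_birthplace_ok = semantics.allows("born_in", PERSON, PLACE, existing=1)
```

Whether a relation such as a birthplace should really be limited to one value is itself a modeling decision, since different sources may disagree.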

Model the content of the documents or their relations?
every piece of information is a source (metamodel)
The semantics -> brings one back to our usual way of working in “History”

Open system in databases

Question: different actors (roles), objects and information (link between the objects)
binary system?
Is it difficult to have a single data model?

Objects – Roles – Information
Levels of users
first level: a modeling method
second level: a database referring …
third level: a database … collective

Who maintains the data? Who is able to enter data?

Scholasticon
Text: only search for text, define a population, define what you want to know about the teachers
(geographical information system software = classic open source software) -> analysis

Questions to the “public”:
Who is using similar methods, who is working on a similar project, etc.

Copyright as a limitation on the collaborative use of databases
Effort to move in this direction: share documents but still keep individual data

Francesco’s database is NOT public (due to copyright infringement concerns and the project’s aim of allowing access only to those who contribute) -> Can this be called Collaborative Research? Does it need to be open?

What is the open part of the project, what is the closed part of the project?
-> They want to publish a certain amount of the data soon
In the general database there are individual pieces of information that have to be marked in order to be shown.
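
A minimal sketch of such a marking mechanism, assuming a hypothetical per-record `public` flag and assuming that contributors always see their own records (neither detail was specified in the session):

```python
from dataclasses import dataclass


@dataclass
class Record:
    statement: str
    contributor: str
    public: bool = False  # hypothetical flag: private unless explicitly marked


def visible_records(records, viewer=None):
    """Return public records, plus the viewer's own records if a viewer is given."""
    return [
        r for r in records
        if r.public or (viewer is not None and r.contributor == viewer)
    ]


records = [
    Record("statement 1", contributor="alice", public=True),
    Record("statement 2", contributor="alice"),
    Record("statement 3", contributor="bob"),
]

anonymous_view = visible_records(records)
alice_view = visible_records(records, viewer="alice")
```

Here an anonymous visitor sees only the record marked as public, while a contributor also sees their own unmarked records.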

Modeling the data is the biggest problem of Francesco’s project

websites dedicated to different projects
Around 30 people are using the database. They would like to improve it but lack the resources

Heurist
Pinakes

Is it worthwhile to work with only three tables (role, source, information)? What about changing interpretations, perception, etc… Database seems limited to “facts”
Three different documents give three different interpretations of the information. Pinakes does link this information.
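
To make the three-table question concrete, here is a rough sketch using Python's built-in `sqlite3` module. The table layout, column names and sample rows are invented for illustration and do not reflect the actual schema of Francesco's database, Heurist or Pinakes. It shows one way in which diverging statements can coexist when every piece of information points to its source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source (
    id    INTEGER PRIMARY KEY,
    label TEXT NOT NULL
);
CREATE TABLE information (
    id        INTEGER PRIMARY KEY,
    source_id INTEGER NOT NULL REFERENCES source(id),
    statement TEXT NOT NULL
);
CREATE TABLE role (
    information_id INTEGER NOT NULL REFERENCES information(id),
    object_name    TEXT NOT NULL,
    role           TEXT NOT NULL
);
""")

# Invented sample data: three documents, three diverging statements.
conn.executemany(
    "INSERT INTO source (id, label) VALUES (?, ?)",
    [(1, "Document A"), (2, "Document B"), (3, "Document C")],
)
conn.executemany(
    "INSERT INTO information (id, source_id, statement) VALUES (?, ?, ?)",
    [(1, 1, "taught in Town X"), (2, 2, "taught in Town Y"), (3, 3, "did not teach")],
)
conn.executemany(
    "INSERT INTO role (information_id, object_name, role) VALUES (?, ?, ?)",
    [(1, "Person P", "subject"), (2, "Person P", "subject"), (3, "Person P", "subject")],
)

# Everything recorded about one object, each statement with its source.
rows = conn.execute(
    """
    SELECT s.label, i.statement
    FROM role r
    JOIN information i ON i.id = r.information_id
    JOIN source s ON s.id = i.source_id
    WHERE r.object_name = ?
    ORDER BY s.label
    """,
    ("Person P",),
).fetchall()
```

The query returns all three statements side by side rather than a single “fact”. Changes in interpretation over time would still need something extra, for example a date or an author attached to each information row.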

Volunteers as a means of improving your system -> volunteers are not wanted, as they are interested in using the tool as historians, not in focusing on the system
Is this a long term approach?

Who dares to share databases (historians)?
Sharing is not considered a way of publication and is therefore not valued

Using digital humanities tools can mean that you see more than the publisher of the data sees.

Virtual manuscripts, and for the part of the Stiftsbibliothek St. Gallen
using and harvesting the metadata…

discussion closed 😉
