Sunday, December 18, 2005

New tables due to many new ideas and other novelties

Ultimate Wiktionary is making a rush towards its first "outing". This results in all kinds of interesting things. It results in even more interest before we have anything to show for ourselves. It results in reasoned suggestions to change some labels in favour of others: "Attribute" instead of "Label", for instance, because "label" has a meaning that is confusing to a large constituency within UW's community.

Wikidata is not the same as Ultimate Wiktionary and consequently has requirements of its own, including language requirements. It may need longer texts, and it may require texts in a format that Ultimate Wiktionary frowns upon, like capitalised expressions. As we are investigating the use of TBX for the static part of Ultimate Wiktionary, it made sense to think about TMX as well for this issue. This means that we need some basic functionality for handling translation projects. I have come up with this extension of Ultimate Wiktionary; this data design makes use of tables that are part of UW and may as a result become part of MediaWiki proper.

I realise that when we implement this, we have the core of a translation / localisation workflow. This makes sense when you consider that Wikipedia, one of the biggest websites in the world, exists in 212 different languages. When a MediaWiki message is changed, who is going to do the translation? I doubt that there is one organisation that can do that well on a continuous basis. As I am a firm believer in using standards AND in eating my own dogfood, this is my first take on this issue.
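
To make the point about changed messages concrete, here is a minimal sketch of how such a workflow could notice which translations need redoing. It is not the actual UW data design: the `Message` class, its fields and the revision counter are all hypothetical names chosen for illustration, and the assumption is simply that every translation remembers which revision of the source text it was based on.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Message:
    """A source-language interface message (hypothetical model, not the UW schema)."""

    key: str
    text: str
    revision: int = 1
    # language code -> (translated text, source revision it was based on)
    translations: dict[str, tuple[str, int]] = field(default_factory=dict)

    def update(self, new_text: str) -> None:
        """Change the source text; bump the revision only if the text really changed."""
        if new_text != self.text:
            self.text = new_text
            self.revision += 1

    def translate(self, lang: str, text: str) -> None:
        """Store a translation and record the source revision it corresponds to."""
        self.translations[lang] = (text, self.revision)

    def stale_languages(self) -> list[str]:
        """Return the languages whose translation predates the current source text."""
        return sorted(
            lang
            for lang, (_, based_on) in self.translations.items()
            if based_on < self.revision
        )


msg = Message(key="save", text="Save page")
msg.translate("nl", "Pagina opslaan")
msg.translate("de", "Seite speichern")

msg.update("Save changes")                    # the source message changes
msg.translate("de", "Änderungen speichern")   # only German is redone

needs_work = msg.stale_languages()            # Dutch still reflects the old text
```

With 212 languages, a list like `needs_work` is what a translation project would hand out to its volunteers whenever a message changes.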

Thanks,
GerardM

2 comments:

Anonymous said...

Hi,

It will be very difficult to map a TBX document to the structure shown at http://commons.wikimedia.org/wiki/Image:UW-translations.jpg

A TBX document has two parts: an XML file with the data and another XML file that describes the data format (XCS). The structure shown in the database diagram does not account for the many combinations that an XCS file can describe.

Get the TBX examples available at http://www.lisa.org/standards/tbx/samples/ and try to enhance the database schema to support all the samples.

FWIW, this is perfectly achievable.

Regards,
Rodolfo M. Raya

GerardM said...

The good news is that I will be helped by some notable people in achieving just this. They will not only help with the implementation but also do a feasibility study, and yes, they have the necessary expertise.

Thanks,
GerardM