Geiriau am eiriau (Words about words): the blog of the Language Technologies Unit, Canolfan Bedwyr, Bangor University

Murmur

Welsh tools for OpenOffice.org version 3

Following the release of version 3 of OpenOffice.org, we needed to update and repackage the basic Welsh spellchecker that we developed back in 2004 for OpenOffice.org. We are pleased to announce that the revised spellchecker is now available as an extension, and can be downloaded and installed from

http://extensions.services.openoffice.org/project/lingucomponent-cy.

Remember too that a Welsh version of the OpenOffice.org version 3 interface is available, localised by the Meddal.com team. Go to http://ftp.linux.cz/pub/localization/OpenOffice.org/devel/OOO300/OOO300_m9/Build-2/ to download the version for your own operating system (i.e. Windows, Mac or Linux).

Dysgwr disglair

Llongyfarchiadau i David Chan ar gyrraedd rownd derfynol Dysgwr y Flwyddyn Eisteddfod Genedlaethol 2007.

Mae David yn aelod o dîm yr Uned ac mae wrthi ar hyn o bryd yn cyfrannu ei ddoniau ieithyddol Cymraeg newydd (yn ogystal â’i ddoniau technegol helaeth) at y gwaith o ddatblygu Cysill a Cysgliad.

Mae David eisoes yn weithgar iawn gyda meddalwedd Cymraeg ac wedi bod yn allweddol i waith datblygu OpenOffice.org Cymraeg gyda Meddal.com ac Agored.

Welsh learner finalist

Congratulations to David Chan on becoming a finalist in the National Eisteddfod 2007 Learner of the Year competition.

David is a member of the Unit and is currently contributing his new Welsh linguistic skills (as well as his considerable technical ones) to the further development of Cysill and Cysgliad.

David has already been very active in Welsh-language software, and was key to developing the Welsh OpenOffice.org with Meddal.com and Agored.

Slides: Technoleg i Genedl Fach (Technology for a Small Nation)

We are also trying out Slideshare.net here.

Ni yw y byd

Mae’n syntheseiddydd testun-i-lais Cymraeg ni, sy’n gydnaws ag API lleferydd Microsoft, wedi tyfu’n sail i system destun-i-lais newydd i’r iaith Sinhala, a siaredir gan 16 miliwn o bobl yn Sri Lanka.

Dros y misoedd diwethaf rydym wedi dod i gysylltiad ag Asanka Wasala, sy’n gwneud ymchwil Sinhala yn Lab Ymchwil Technoleg Iaith Prifysgol Colombo, Sri Lanka. Gan ddefnyddio ein meddalwedd MSAPI, a gydag ychydig o’n cymorth ni, mae e bellach wedi datblygu syntheseiddydd testun-i-lais cyntaf yr iaith Sinhala, gan ddefnyddio fframwaith Festival. Hwn hefyd yw’r tro cyntaf i unrhyw un addasu ein MSAPI Cymraeg ni at iaith arall.

Mae’r system destun-i-lais Sinhala’n gweithio’n dda gyda darllenydd sgrin “Thunder”, sydd am ddim at ddefnydd personol. Mae hyn yn golygu y bydd siaradwyr Sinhala sy’n ddall neu’n wan eu golwg yn medru defnyddio darllenydd sgrin yn eu hiaith eu hunain am y tro cyntaf.

Mae Asanka Wasala wrth ei fodd â chanlyniad y cyd-weithio, gan ddweud: “Rydw i mor ddiolchgar i chi a’ch tîm am fod wedi creu a rhannu darn o feddalwedd mor ardderchog ac mor ddefnyddiol … Mae cymuned ddall Sri Lanka yn diolch i chi.”

The first step in our bid for world domination…

Our Microsoft Speech API-compliant version of the Welsh text-to-speech synthesiser has formed the basis for a new text-to-speech (TTS) system for the Sinhala language, spoken by 16 million people in Sri Lanka.

Over the past few months we have been in touch with a researcher on Sinhala, Asanka Wasala, who is based at the Language Technology Research Lab at the University of Colombo, Sri Lanka. Using our MSAPI software, and with some tips from us, he has now developed the first commercial-quality text-to-speech synthesiser for Sinhala, using the Festival framework. This is also the first adaptation of our Welsh MSAPI work to another language.

The Sinhala TTS system works well with the “Thunder” screenreader, which is free for personal use. This means that for the first time, blind and visually impaired Sinhala speakers will be able to use a screenreader in their own language.

Asanka Wasala is delighted at the positive results of our co-operation, writing: “I am so thankful to you and your team for creating and sharing such a useful, brilliant piece of software … Please accept the thanking of blind community of Sri Lanka.”

Bydd yn rhydd

Nodwedd orau meddalwedd rhydd, i nifer, yw ei gost. Mae’n fwy na thebyg y cewch chi’r meddalwedd am ddim. Os ydych chi am dalu am feddalwedd rhydd (am CD o’r peth, er enghraifft), fe fydd ffordd i chi gael fersiwn rhad ac am ddim ohono (fel arfer drwy lwytho’r peth i lawr o’r we, os oes gennych chi gysylltiad ddigon cyflym).

Ond mae i feddalwedd rhydd nodwedd arall hefyd, sef y rhyddid mae’n rhoi i unrhyw un i newid yr hyn yr ydych chi wedi’i greu, a’i addasu at ei dibenion nhw. Ambell waith, gall hyn weithio ar draws rhaglenni hyd yn oed.
Dyma i chi enghraifft o hynny, gydag ymddiheuriadau i’r rhai ohonoch sy’n gwybod am hyn eisoes: geiriadur Cymraeg er mwyn gwirio sillafu yn Firefox 2.0, wedi ei becynnu gan Thomas Thurman. Yr hyn wnaeth Thomas oedd cymryd y geiriadur wnaethon ni ei greu ar gyfer OpenOffice Cymraeg ac Agored, dadansoddi’r ffeil a’i ail-becynnu ar gyfer Firefox. Gwaith awr ginio, mae’n debyg. Ac mae’n gweithio’n wych.

Diolch Thomas, ac os oes unrhyw un arall am gymryd ein cynnyrch rhydd ni a’i addasu at eu dibenion nhw… wel, perffaith ryddid i chi wneud. Meddalwedd rhydd yw e, wedi’r cyfan.

Be free

The best thing about free software, for many people, is its cost. It’s more than likely that you’ll get it for nothing. If you need to pay for it (if you want it on CD, for instance), there’ll be a way for you to get a no-cost version (usually by downloading it, if your internet connection’s fast enough).

But free software has another benefit too: the freedom it gives people to change what you’ve created, and adapt it to their needs. Sometimes, this can even mean taking something that works in one piece of software, and making it work in another.

Here’s an example of just that, with apologies to those of you who’ve seen it already: a Welsh dictionary for spellchecking in Firefox 2.0, packaged up by Thomas Thurman. Thomas took the dictionary we’d created for the Welsh OpenOffice project and for Agored, dissected the files and repackaged it for Firefox. A lunchtime’s work, according to him. And it works brilliantly.

Thanks Thomas, and if anyone else wants to take our free software and bend it into other shapes… well, you’re perfectly welcome to do so. It is free software, after all.
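For the curious, here is a rough sketch of what that kind of repackaging can look like. It is not Thomas’s actual script: it simply assumes that the OpenOffice dictionary is a pair of Hunspell/MySpell-style files (a `.dic` word list and an `.aff` affix file) and copies them into a zip archive under a `dictionaries/` folder, which is the layout older Firefox dictionary add-ons used. The function name, the file names and the `cy-GB` locale code are all made up for illustration, and the install manifest that a real add-on also needs is left out.

```python
import zipfile
from pathlib import Path


def repackage_dictionary(dic_path, aff_path, out_path, locale="cy-GB"):
    """Copy a word list (.dic) and affix file (.aff) into a zip archive.

    The dictionaries/<locale>.* layout and the locale code are assumptions
    made for this sketch; the install manifest a real Firefox add-on needs
    is deliberately not generated here.
    """
    # Map each name inside the archive to the source file it comes from.
    members = {
        f"dictionaries/{locale}.dic": Path(dic_path),
        f"dictionaries/{locale}.aff": Path(aff_path),
    }
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for arcname, source in members.items():
            archive.write(source, arcname)
    # Return the archive member names so the caller can check the result.
    return sorted(members)
```

Calling `repackage_dictionary("cy_GB.dic", "cy_GB.aff", "welsh-dictionary.xpi")` (hypothetical file names again) would write the archive and return the two paths stored inside it. The point is only that the word list itself goes across untouched; all that changes is the wrapper around it.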

Workshop proposal: Free Software for speech and language technology for less-resourced languages

I’ve submitted a proposal for a ‘Special Session’ on speech and language technology for minority languages, to be held as part of a major international conference on speech technology, Interspeech 2007. The proposal is in the name of the SALTMIL Special Interest Group (‘Speech and Language Technology for Minority Languages’, founded in 1999).

The title of the Special Session (if approved) would be ‘Free software for speech and language technology for less-resourced languages: sharing experiences and best practice’.

Here’s the relevant part of the proposal:

Free software for speech and language technology for less-resourced languages: sharing experiences and best practice

Speech and language technology researchers who work on less-resourced languages are often constrained to use free software, simply because of the severe lack of funding and software available. This software may be either open-source or closed-source, and in the latter case it may be a version of proprietary software that is made available for non-commercial use only. However, free software can be lacking in documentation and training, and can contain many undocumented features that impede progress. There is a need for researchers to come together and share experiences of using such software, including recommendations for getting the most out of it. This would include the following:

  • Examples of systems built using free software (possibly with demonstrations).
  • Presentations of bugs encountered, and strategies for dealing with them.
  • Presentations of additions and enhancements made to the software by a research group.
  • Descriptions of desired features for possible future implementation.

This kind of presentation is better done in a Special Session rather than scattered across several sessions (which is what one tends to find with papers on less-resourced languages). This is because it can be difficult to attend all papers that concern less-resourced languages when these are in several (often conflicting) sessions. Also, it can be difficult to ascertain from the abstracts alone whether a particular paper involves using or adapting free software. A Special Session dedicated to this specific topic would make it much easier for (often very isolated) researchers to learn of what has been done already with free software, thus avoiding the duplication of effort. It would also make it much easier for researchers to make contact with others who are already using the software that they plan to use.

There is no guarantee that a proposed Special Session will be approved by the Interspeech organisers, but this kind of workshop is surely needed, in some form or other. Those of us working closely with open-source software have come to realise it can have drawbacks. The financial cost may be zero, but there are other costs involved, which require time, effort, and significant expertise to overcome. We are committed to the use of open-source software, but this also means we need to find ways of sharing all the undocumented ‘folklore’ that is essential to actually using the software successfully. The hope is that this proposed ‘Special Session’ will be one of those ways.