by: terrymiguel
Dec 02 2021, 9:40am CST ~ 2 years, 4 mos ago. 
Question: As I mentioned in my introduction I'm a software developer who has been building a Tagalog-English translation program using a neural machine translation toolkit. One of the things I have noticed in the training data over the past year is that Tagalog has many synonyms or alternatives for fairly common words, one of the alternatives often being a word of Spanish origin, e.g. pamahalaan/gobyerno. Are alternatives used arbitrarily, or are there implicit (social) rules governing the choice of one or the other?
 
PinoyTaj Badge: Supporter
Dec 02 2021, 1:51pm CST ~ 2 years, 4 mos ago. 
It's mostly socially based (formality, age of speaker, region, etc.). Sometimes the Tagalog forms of the words are only used in modules, in the news, as a joke, or very rarely by modern Tagalogs. (However, I would still learn the words, as a lot of the words people told me not to learn have actually been used by natives I talk to on a daily basis.) Your social circle is everything, really. If you hang out in Makati you'll get Makati Tagalog. If you hang out in Bulacan you will get that Tagalog. (lol)
 
There is also another side to this: some Spanish words are used by Filipinos (usually in poetry) that aren't really used in the language, haha.
 
terrymiguel
Dec 02 2021, 2:42pm CST ~ 2 years, 4 mos ago. 
@PinoyTaj Thanks. So if you were talking about the government you would probably use "gobyerno"?
 
BoraMac Badge: Supporter
Dec 02 2021, 3:03pm CST ~ 2 years, 4 mos ago. 
Article XIV, Section 8 of the 1987 Constitution states that the Constitution
shall be promulgated in Filipino and English and shall be translated into major
regional languages, Arabic, and Spanish. The foregoing constitutional mandate is
in furtherance of the constitutional declaration under Section 7 of Article XIV that
the official languages of the Philippines are Filipino and, until otherwise provided
by law, English for purposes of communication and instruction.
 
legacy.senate.gov.ph/lisdata/2183818560!.pdf
 
What's the PURPOSE of the translation...formal or informal communication?
 
A ton of informal communication prefers Spanish words fully adapted to modern Filipino and, to a lesser extent, English words adapted to modern Filipino. How much of English is borrowed/expropriated :D from other languages? Look at the percentages!
 
Have you considered code-switching in modern conversation? How much time have you spent connecting in spoken Filipino rather than pushing Wayne's World data?
 
First thought off the top of my head...accumulate a look-up list of Spanish-based Filipino words paired with formal equivalents...same for English.
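
As a rough illustration of that look-up list idea, here is a minimal Python sketch. The table and the function name `apply_register` are hypothetical, and the only pair included is the one discussed in this thread (gobyerno/pamahalaan); real pairs would need to be vetted one by one, since some apparent synonyms differ in sense.

```python
import re

# Hypothetical look-up list: informal (Spanish-derived) form -> formal equivalent.
# Only the pair mentioned in this thread is included; extend it with vetted pairs.
INFORMAL_TO_FORMAL = {"gobyerno": "pamahalaan"}
FORMAL_TO_INFORMAL = {formal: informal for informal, formal in INFORMAL_TO_FORMAL.items()}

# Matches runs of letters (no digits or underscores).
_WORD = re.compile(r"[^\W\d_]+")


def apply_register(text, register):
    """Swap whole words toward the requested register ("formal" or "informal")."""
    if register == "formal":
        table = INFORMAL_TO_FORMAL
    elif register == "informal":
        table = FORMAL_TO_INFORMAL
    else:
        raise ValueError("register must be 'formal' or 'informal'")

    def swap(match):
        word = match.group(0)
        replacement = table.get(word.lower())
        if replacement is None:
            return word  # not in the list: leave untouched
        # Preserve an initial capital letter.
        return replacement.capitalize() if word[0].isupper() else replacement

    return _WORD.sub(swap, text)


print(apply_register("Gobyerno ng Tsina", "formal"))
```

This only swaps exact whole words. Forms carrying a linker or affix (for example "pamahalaang") are left alone, so a real tool would need some morphological handling on top of the list.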
 
To my ear...the informal dominates...but I prefer modern casual conversations. You want to provide sermons...clearly different result.
 
Good luck...lub lub lub to see deeper reference tools beyond the ever expanding empire of jkos! :D
 
PinoyTaj Badge: Supporter
Dec 02 2021, 3:15pm CST ~ 2 years, 4 mos ago. 
@terrymiguel I use both, but I'm not a native speaker. I use gobyerno most of the time because it is easier to pronounce.
 
kpagelimbo Badge: Native Tagalog Speaker, Official Tagalog.com Teacher
Dec 02 2021, 9:20pm CST ~ 2 years, 4 mos ago. 
I use and hear the word 'gobyerno' more often. 'Pamahalaan' is mostly used in textbooks, I think. The same goes for words borrowed from other languages: we use them more often than their original Tagalog counterparts.
 
jkos Badge: Admin, Supporter, Serious Supporter, VIP Supporter
Dec 03 2021, 5:14am CST ~ 2 years, 4 mos ago. 
@terrymiguel
You might find the Tagalog.com Corpus useful.
www.tagalog.com/examplefinder/
It will give you frequencies of different words to compare usage. You can also limit searches by content type…so, for example, you could search for pamahalaan vs gobyerno in the context of “Internet Comments” (an informal source, where gobyerno is overwhelmingly the favorite) vs “All News Articles” (a more formal source, where gobyerno and pamahalaan are used close to equally).
 
terrymiguel
Dec 03 2021, 5:59am CST ~ 2 years, 4 mos ago. 
@jkos Thank you for that. I have just done a quick search on the training set I used to build the neural MT Tagalog-English software, and I find that "pamahalaan" has 8215 occurrences whereas "gobyerno" has only 1834. Based on the comments here, that suggests that my training set does not really represent everyday usage. Out of interest, my software generally chooses "pamahalaan" as the translation of "government", as does Microsoft Translator, whereas Google Translate tends to go for "gobyerno". I will certainly use the Corpus to do comparisons of common word pairs. As I am a Spanish speaker, the influence of Spanish in the Philippines was one of the reasons that drew me to Tagalog as the language to take for my NMT experiments.
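
For anyone wanting to run the same kind of check on their own data, a quick search like this could be sketched in Python roughly as follows. The function names `count_variants` and `share` are made up for illustration, and the two sample lines are toy input, not the actual training set; only the totals 8215 and 1834 come from the search described above.

```python
import re
from collections import Counter


def count_variants(lines, variants):
    """Count exact whole-word, case-insensitive occurrences of each variant."""
    wanted = {v.lower() for v in variants}
    counts = Counter({v: 0 for v in wanted})
    for line in lines:
        # Split each line into runs of letters and tally the words we care about.
        for token in re.findall(r"[^\W\d_]+", line.lower()):
            if token in wanted:
                counts[token] += 1
    return counts


def share(counts, word):
    """Fraction of all counted occurrences that belong to `word`."""
    total = sum(counts.values())
    return counts[word] / total if total else 0.0


# Toy input standing in for one side of a parallel corpus.
sample = ["Ang gobyerno at ang pamahalaan.", "Gobyerno ng Tsina"]
print(count_variants(sample, ["pamahalaan", "gobyerno"]))

# The totals reported in this post, expressed as a share.
reported = Counter({"pamahalaan": 8215, "gobyerno": 1834})
print(round(share(reported, "pamahalaan"), 2))
```

Because the match is on exact whole words, forms with a linker or affix (such as "pamahalaang") are not counted, so the figures are a lower bound for each word. Running the same count separately on formal and informal sources would show how far a data set leans toward one register.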
 
Bituingmaykinang
Dec 03 2021, 8:46am CST ~ 2 years, 4 mos ago. 
Tagalog does have a lot of linguistic quirks that will make it difficult for AI machines to create more "natural-sounding" sentences.
 
terrymiguel
Dec 03 2021, 9:14am CST ~ 2 years, 4 mos ago. 
@Bituingmaykinang I know, that's why I saw it as a challenge :-)
 
jkos Badge: Admin, Supporter, Serious Supporter, VIP Supporter
Dec 04 2021, 9:59am CST ~ 2 years, 4 mos ago. 
@terrymiguel
Just curious…I would think this would be super difficult. Do you think you will be able to create something that is competitive with, say, Google Translate?
 
terrymiguel
Dec 04 2021, 10:21am CST ~ 2 years, 4 mos ago. 
@jkos Thanks for your question. Using a technique called Byte Pair Encoding within the SentencePiece toolkit, the NMT model seems to be able to handle some of the complexities of Tagalog grammar, although not everything is done perfectly. I did not want to use this forum to promote my work directly, but there are Tagalog translation models of mine already available. There is a free online translation engine at www.nmtgateway.com and a low-cost Windows-compatible commercial offline translation program at mydutchpal.com. The free online program represents an earlier stage of development work.
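
For readers unfamiliar with Byte Pair Encoding, the core idea can be shown with a toy pure-Python sketch. This is not the SentencePiece implementation and not the setup used for the models mentioned above; the function names, the four-word list (forms of the root "kain", to eat) and the merge count are assumptions chosen purely for illustration.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merges by repeatedly joining the most frequent adjacent pair."""
    vocab = Counter(tuple(w) for w in words)  # each word starts as a sequence of characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair; ties are broken by the pair itself so results are deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Split a word into subword units by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Illustrative word list only, not real training data.
merges = learn_bpe(["kain", "kumain", "kakain", "kumakain"], num_merges=5)
print(segment("kumakain", merges))
```

The point of the sketch is that frequent character sequences end up as reusable subword units, so differently affixed forms of one root can share pieces instead of each being a separate vocabulary entry.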
 
terrymiguel
Dec 06 2021, 1:18pm CST ~ 2 years, 4 mos ago. 
I'd just like to add that this is an ongoing project and the free online program now uses the same translation models as the paid version. The only difference is that the paid version can be used to translate MS-Office documents and not just screen input. It's far from perfect but I hope the free program will be a useful tool for learners.
 
landaspantas
Dec 21 2021, 9:25am CST ~ 2 years, 3 mos ago. 
Newscasts usually use 'gobyerno' for foreign governments, and 'pamahalaan' for the Philippine national government and its local governments.
 
"Gobyerno ng Tsina," "Gobyerno ng Amerika"
(Chinese government), (American government)
 
"Pamahalaan, nagbabala laban sa.."
([The national] government warns against..)
 
"Pamahalaang panlunsod ng Pasig, naglabas ng.."
(City government of Pasig issues..)
 
The academe favors 'pamahalaan' for both.
 
Some words have specialized meanings or are exclusive for particular senses.
 
'Pamilihan' means market in general.
'Merkado' means market in the figurative sense.
 
(There's also 'palengke,' 'tiangge,' and 'talipapa' but those are just subcategories of 'pamilihan')
 
"Presyo ng bigas sa merkado."
(The price of rice in the market.)
 
'Kasaysayan' means history, and it is and has always been preferred over 'historia', even in the Spanish era. 'Istorya' today almost always means the same as the English 'story.'
 