The Devil is in the Details: Why You Need to Start Using a Corpus to Localize and Translate Better
This post was originally published on . It is reposted with permission.
Do you know what a CORPUS is? Have you ever used a corpus in your translation career? Have you ever been stuck on a simple phrase in your source language, unable to find an equivalent that reads as smoothly as the original? If you answered no to the first two questions but yes to the third, read on!
When your dictionaries, glossaries, Google searches, and other terminological tools can’t help you produce a seamless translation, and your brain is fogged up, you may end up with an accurate but clunky phrase that does a disservice to your client. Whether you’re localizing a product or translating anything from a report to a slogan, these clunky phrases can, in the best case, make for an awkward reading experience for your target audience and, in the worst case, alienate the end user or reader and turn them away. The results for your client can be catastrophic. As many localization, translation and engineering experts know, the devil is in the details.
A corpus can help in these cases.
Let’s dive in.
A corpus is a large collection of text, written, spoken or both, stored in a database. It can contain many millions or even billions of words, coming from books, newspapers, magazines, journals or works of literature that have been scanned or downloaded electronically. Some corpuses may also contain spoken language coming from transcripts of ordinary conversations, like phone calls, business meetings, conferences, parliamentary meetings, or even radio broadcasts and TV shows.
Corpuses show how language is used in society, in real life. When you translate, you’re creating a real-life text that will live in a new language, in a new society. You need to reflect how that society speaks and address its members in their own language, on your client’s behalf. If you don’t, your business may suffer.
With a corpus, we no longer have to rely so heavily on intuition to know whether a particular adjective goes well (or, to use the technical term, collocates) with a certain noun, or whether a word is usually used in a particular context. Instead, we can see what hundreds of different speakers and writers have actually said or written before.
Let’s look at a sentence that I was struggling with a few months ago when translating a social justice article from English into Spanish for the Hispanic community in the US:
“The ability of qualified election officials to conduct legitimate audits of their own has gained urgency with partisan actors fueled by the Big Lie conducting partisan reviews that spread false information about elections and undermine confidence in our democracy.”
This sentence is structurally pretty complex in English, and you’ll need a fairly good knowledge of syntax to break up its parts and piece them all together in your target language. I’m not going into any of that now, but you can see . I just want to concentrate on the phrase “has gained urgency.” This is a good collocation in English. The verb “gain” collocates with the noun “urgency.” If we read “has taken urgency” instead, it will probably make us pause and hesitate. If this were a slogan, it would be a big flaw. The same happens in your target language. Not just any verb will collocate with the translation of the noun “urgency.” In my target language, Spanish, things get a bit more complicated because we’re so used to reading false friends, calques and literal translations that it’s sometimes hard to separate the wheat from the chaff. There are a million examples of literal translations; a heavily broadcast recent example is Will Smith’s slapping remark calqued into Spanish, some of the examples are , , and .
So in this example, if we go the literal route, we could say “ganar urgencia,” but does this sound natural? No. So let’s do what translators do a lot of: find synonyms.
Ganar is synonymous with:
- lograr
- adquirir
- adueñarse
- triunfar
- vencer
- aventajar
- exceder
- sobrepujar
- superar
- dominar
- conquistar
- tomar
- cobrar
- alcanzar
- llegar
- captar
- granjear
- atraer
- prosperar
- mejorar
At this point, we’re relying on our intuitive knowledge of our target language to decide which one goes well, or collocates, with urgencia. But what if we could confirm our hunches with a corpus? Of all these synonyms, I’ve narrowed down my options to “ganar urgencia,” “adquirir urgencia,” “tomar urgencia,” and “cobrar urgencia.” But which one to choose? Corpuses can help us in a way no other tool can.
This is one example of the many corpuses out there. See the reference list at the end for more examples.

This is a Spanish corpus called CORPES, a corpus of 21st-century Spanish containing texts from 2001 to 2020.
So let’s try our narrowed-down options, “ganar urgencia,” “adquirir urgencia,” “tomar urgencia” and “cobrar urgencia.” Watch the search live in the video below. I invite you to learn a new free tool to translate and localize better in six minutes.
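For readers who like to script things, the idea behind this kind of search can be roughly sketched in a few lines of Python. This is only an illustration under stated assumptions: the sentences below are invented and do not come from CORPES or any real corpus, the function name `count_collocations` is hypothetical, and the verb stems are a crude stand-in for proper handling of Spanish conjugation. The sketch simply counts how often a form of each candidate verb appears directly before "urgencia."

```python
import re
from collections import Counter

# Invented sentences standing in for corpus hits; NOT real CORPES data.
SAMPLE_SENTENCES = [
    "El tema ha cobrado urgencia tras las últimas elecciones.",
    "La reforma cobra urgencia cada año.",
    "El asunto adquiere urgencia en este contexto.",
    "La tarea tomó urgencia de repente.",
    "No hay urgencia para ganar el partido.",
]

# Crude stems (an assumption) so that conjugated forms such as
# "cobra", "cobrado" or "adquiere" are matched to their infinitive.
CANDIDATE_STEMS = {
    "ganar": ("gan",),
    "adquirir": ("adquir", "adquier"),
    "tomar": ("tom",),
    "cobrar": ("cobr",),
}


def count_collocations(sentences, stems_by_verb, noun="urgencia"):
    """Count how often a form of each verb directly precedes the noun."""
    counts = Counter({verb: 0 for verb in stems_by_verb})
    for sentence in sentences:
        # Lowercase and split into word tokens (accented letters included).
        tokens = re.findall(r"\w+", sentence.lower())
        for previous, current in zip(tokens, tokens[1:]):
            if current != noun:
                continue
            for verb, stems in stems_by_verb.items():
                # str.startswith accepts a tuple of alternative prefixes.
                if previous.startswith(stems):
                    counts[verb] += 1
    return counts


counts = count_collocations(SAMPLE_SENTENCES, CANDIDATE_STEMS)
for verb, hits in counts.most_common():
    print(f"{verb} + urgencia: {hits}")
```

Because the sample sentences are made up, the counts this prints say nothing about real Spanish usage. The point is only the mechanism: a real corpus replaces intuition with frequencies drawn from millions of words, and its search interface also handles conjugated forms far more reliably than the stem matching used here.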
References
- Bilingual/Multilingual: Open-Source Parallel Corpus, OPUS:
- Spanish: CORDE, diachronic, from the beginning to 1974:
- CREA, contemporary, collecting spoken and written texts from 1974 to 2004: .
- CORPES (beta version), written texts from 2001 to 2020: .
- English: Some are the Corpus of Contemporary American English (COCA), the Corpus of Historical American English (COHA), the News on the Web (NOW) Corpus, and the TV Corpus. Usefully collected in (free but registration required).
About the Author
Ana Lis Salotti is an English>Spanish translator and educator based in Berkeley, California. Originally from Argentina, she has over 16 years of experience in the industry, and holds a master’s degree in Translation and Interpreting from the University of New South Wales, Sydney, Australia. She has always combined deep practical experience with theoretical grounding and a research-driven approach to translation. Her translation specializations include the environmental and conservation sciences, audiovisual translation, and US Hispanic community issues. She has also designed and taught various translation courses at New York University, and the City College of New York’s Hunter College and John Jay College of Criminal Justice. She also offers some training through her own website.