MyMemory: under the hood

Access the world's largest translation memory

MyMemory gives you quick access to a large number of translations originating from professional translators, LSPs, customers and multilingual web content. It uses a powerful matching algorithm to provide the best translations available for your source text. MyMemory currently contains 9,141,250,319 professionally translated segments.

Get relevant matches for your project

When you upload a new source document (.doc, .html, or .rtf), MyMemory searches its entire public archive and presents you with a customized memory (.tmx) containing the segments and terminology that best match your document.

Back up your memories in a safe place

Uploading your memories gives you a simple back-up solution. Memories are stored in a secure, world-class data center. Stored segments may be exported and downloaded at any future time, with or without modifications made by other translators. You can also create new memories by merging and querying existing memories.

Protect your customer privacy by hiding proper names and brands

MyMemory lets you remove proper names or brands from uploaded memories with one click. In this way you can contribute memories without risk of distributing confidential information. For example:

Original sentence: IBM will hire new employees if we keep this information confidential.

Becomes: <PROTECTED> will hire new employees if we keep this information confidential.
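The masking step above can be sketched in a few lines of Python. This is only an illustration, not MyMemory's actual implementation: the function name mask_terms and the idea of passing in an explicit list of protected terms are assumptions made for the example, and the source does not say how MyMemory detects names and brands.

```python
import re

def mask_terms(sentence, protected_terms, placeholder="<PROTECTED>"):
    """Replace each protected term with a placeholder (whole words only).

    Hypothetical helper: the list of terms is supplied by the caller.
    """
    if not protected_terms:
        return sentence
    # Try longer terms first so a multi-word brand wins over its prefix.
    ordered = sorted(protected_terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in ordered) + r")\b"
    )
    return pattern.sub(placeholder, sentence)

original = "IBM will hire new employees if we keep this information confidential."
masked = mask_terms(original, ["IBM"])
print(masked)
```

Matching on word boundaries keeps the placeholder from being inserted inside longer words that merely contain a protected term.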

Spidering the web for translations

To enhance the archive, MyMemory crawls the web for bilingual documents, evaluates their quality, classifies them by subject and aligns them. If someone else on the web has already translated a sentence, why do it again?

Improve your memories with Wiki-like contributions from other translators

If you choose to make your memories public, other translators will contribute by fixing errors or improving the translation. You can ask to be notified of each change. You can download your memories with or without these contributions at any time.

Confidentiality: Public and Private Memories, no mass downloads, your identity protected

Only a memory's owner can download a memory in its entirety. Other users may only download segments that match their current document. When you import a memory, YOU decide if it is public or private. Public memories can be searched and edited by the entire community (recommended), while private memories can be searched and edited only by their owners. Additionally, you can choose to mark uploaded memories as 'anonymous' to protect your privacy.

Contribute to a better translation industry

By sharing your memories, you are making a significant contribution to the re-use, consistency, quality and productivity of our industry. You can help other people (positive) and they can help you (very positive :) ). Your contribution is important, and what you get in return is even more so!

Quickly search and update your memory segments

Search and update your uploaded segments with a single click.

Create memories by subject

All uploaded memories are automatically categorized by subject matter, segment by segment. At any time, you can easily download all your segments for a given subject. You are no longer constrained to accessing your memories by, e.g., customer.

Better matching algorithm

Generic, merged memories tend to suffer from the vice of providing matches which are not relevant to specific projects ("over-information"). MyMemory has an advanced matching algorithm that finds highly relevant matches, by using human-like criteria such as context, segment origin, subject, memory ownership and quality. Fewer suggested segments of higher relevancy means faster, better translation. MyMemory proposes best matches in a similar way to Trados, but then helps to complete partial matches by suggesting missing terms (auto-concordance).

Compatible with any CAT

MyMemory uses the TMX file format (Translation Memory Exchange). TMX is an open industry standard supported by Trados, SDLX, Wordfast, DejaVu, LingoTek and practically every other commercially available CAT tool. Terminology is provided both within the TMX for concordance searches and in TBX and TXT formats for terminology management applications such as Multiterm.
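To give a feel for what a TMX file contains, here is a minimal sketch that reads source/target pairs from a small TMX document using Python's standard library. The sample content, the header attribute values and the function name read_pairs are assumptions made for illustration; a file downloaded from MyMemory may carry additional attributes and metadata.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample: one translation unit (tu) with two language variants (tuv).
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>I like white wine</seg></tuv>
      <tuv xml:lang="fr"><seg>J'aime le vin blanc</seg></tuv>
    </tu>
  </body>
</tmx>
"""

def read_pairs(tmx_text, src="en", tgt="fr"):
    """Return (source, target) tuples for units that have both languages.

    Language codes are compared exactly, so "en" does not match "en-US".
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline tags.
                segments[tuv.get(XML_LANG)] = "".join(seg.itertext())
        if src in segments and tgt in segments:
            pairs.append((segments[src], segments[tgt]))
    return pairs

print(read_pairs(SAMPLE_TMX))
```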

Terminology extraction and its pre-translation

Researching terminology is difficult and time consuming, but making the effort to pre-translate terminology before embarking on a translation project can reduce delivery times and improve translation quality. MyMemory is able to extract and pre-translate terminology from your memories and monolingual documents to provide you with the information you need quickly.

Fixing partial matches with statistical machine translation (Beta)

A big innovation in MyMemory is the transformation of partial matches into 'nearly full' matches by identifying missing parts, translating them and using a language model decoder to re-order and disambiguate terms. You can also ask MyMemory to machine translate non-matches when you download a memory for a project: this means your CAT tool will make a suggestion for every segment.

Here is an example of how partial matches can be improved. Suppose you are translating "I like red wine" into French, and your memory only contains a segment for "I like white wine":

Step	English (source)	French (target)	Comments
To translate	I like red wine	?	start translation
In the TM	I like white wine	J'aime le vin blanc	a 75% match was found
Difference Extraction	red	rouge	machine translated
Generated Match	I like red wine	J'aime le vin rouge	decoder creates a new 90% Match*

While this is still in beta release, it is already providing fixes for simple sentences with matches greater than 80%. Our objective is to be able to fix matches as low as 50%.

* A machine generated segment has a user-defined penalty: only a human-translated segment can match 100%.
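The wine example can be reproduced with a toy sketch in Python. It is not MyMemory's algorithm: a word-level similarity from the standard difflib module stands in for the real matcher, a two-entry glossary (GLOSSARY) stands in for machine translation, and no language model decoder is involved, so the sketch only handles single-word substitutions and never re-orders words. The function names and the 10-point penalty are likewise assumptions chosen to mirror the table.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for the machine translation of the missing words.
GLOSSARY = {"white": "blanc", "red": "rouge"}

def similarity(a, b):
    """Word-level similarity between two segments, as a whole percentage."""
    ratio = SequenceMatcher(None, a.split(), b.split(), autojunk=False).ratio()
    return round(100 * ratio)

def repair(new_source, tm_source, tm_target, glossary, penalty_pct=10):
    """Patch a partial match by swapping the words that differ.

    Returns (target, score), or None when the difference is not a simple
    one-word substitution that the glossary can translate.
    """
    old_words, new_words = tm_source.split(), new_source.split()
    target_words = tm_target.split()
    changed = False
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        if tag != "replace" or i2 - i1 != 1 or j2 - j1 != 1:
            return None  # insertions, deletions and longer spans are out of scope
        old_term = glossary.get(old_words[i1])
        new_term = glossary.get(new_words[j1])
        if old_term is None or new_term is None:
            return None
        if target_words.count(old_term) != 1:
            return None  # ambiguous: cannot tell which target word to swap
        target_words[target_words.index(old_term)] = new_term
        changed = True
    # A machine-generated segment is penalised so it can never score 100%.
    score = 100 - penalty_pct if changed else 100
    return " ".join(target_words), score

print(similarity("I like red wine", "I like white wine"))
print(repair("I like red wine", "I like white wine", "J'aime le vin blanc", GLOSSARY))
```

With three of four words shared, the word-level similarity comes out at 75%, and the patched segment is returned with the penalised score rather than 100%.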

How-To