Any product developer or original equipment manufacturer (OEM) can build a product initially for the local regional market. Once it has gained traction there, they can plan its release in other regional markets as required. Suppose the OEM plans to launch the product in Europe; it must then support the following regional languages:
Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hungarian, Irish, Italian, Latvian, Lithuanian, Maltese, Polish, Portuguese, Romanian, Slovak, Slovene, Spanish, and Swedish.
Problems in Multilingual Product Testing
In most cases, product functionality is the same across regions, but the displayed language must change to the regional language. After each language migration, the GUI of the product must be validated again. This validation has to be repeated for each of the remaining 23 regional languages.
Language Knowledge and Higher Resource Cost
To validate a product with multilingual support, the tester must know every regional language the product supports. Hiring a single tester who knows all of them is practically impossible, and hiring several testers who each cover a different set of languages multiplies the resource cost.
Missing Minor Defects
Chinese and Japanese use more complex writing systems. If the developer misses a single dot or stroke in a character, the whole sentence can take on a different meaning, so these errors must be captured without fail. In manual testing, the tester cannot always spot such minor defects.
Mismatching Sentence Footprint
In GUI development, all strings are maintained in a common string file and mapped to text box IDs. The GUI screen contains several text boxes placed at different positions according to the GUI requirements. When the program renders the GUI, the processor reads the common string file and places each string at the location of its text box. The common string file contains the text IDs and the equivalent text or sentence in each language. Each text ID has its own footprint (display area), and the text or sentence must fit into it. If it does not, the text overlaps other text, gets hidden, or wraps to the next line. Such scenarios can occur in devices that support multiple languages.
Initially, the developer designs the GUI layout for one language, say English. To migrate to any other language, the developer updates the common string file with the equivalent text or sentence in that language. Assume the English text is ‘perform patient test’ and its Spanish equivalent is ‘realizar la prueba del paciente’. The two strings differ in length, so if the layout was designed only for English, the Spanish text may overlap or get hidden. Automation tools that run inside the device under test (DUT) and read the GUI text by text ID cannot detect hidden, overlapped, or wrapped text, because they see the stored string and not the rendered screen.
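The footprint problem described above can be illustrated with a minimal sketch. The string table layout, the text ID `TXT_PERFORM_TEST`, the 200-pixel footprint, and the fixed 8-pixel character width are all assumptions made for illustration; a real check would measure the rendered width with the actual font.

```python
# Hypothetical string table: text ID -> {language code: text}.
STRINGS = {
    "TXT_PERFORM_TEST": {
        "en": "perform patient test",
        "es": "realizar la prueba del paciente",
    },
}

# Hypothetical footprint width in pixels for each text ID.
FOOTPRINT_WIDTH_PX = {"TXT_PERFORM_TEST": 200}

# Assumption: a fixed-width font where every character is 8 px wide.
CHAR_WIDTH_PX = 8


def find_overflows(strings, footprints, char_width_px=CHAR_WIDTH_PX):
    """Return (text_id, lang, estimated_width, limit) for strings that do not fit."""
    overflows = []
    for text_id, translations in strings.items():
        limit = footprints[text_id]
        for lang, text in translations.items():
            width = len(text) * char_width_px
            if width > limit:
                overflows.append((text_id, lang, width, limit))
    return overflows


for text_id, lang, width, limit in find_overflows(STRINGS, FOOTPRINT_WIDTH_PX):
    print(f"{text_id} [{lang}]: needs {width} px, footprint is {limit} px")
```

Under these assumptions the English string (20 characters) fits, while the Spanish string (31 characters) exceeds the footprint. A static check like this only estimates the problem; confirming what is actually visible still requires looking at the rendered screen.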
GUI Capture and Poor Image Quality
In an embedded environment, running the automation tool inside the DUT is often not possible because of memory and processor speed constraints. The automation tool must therefore run on a host PC and communicate with the DUT. To validate the GUI content, the display image has to be captured and transferred to the automation system where the tool runs. This can be done through an HDMI or VGA display output, provided the DUT has that hardware. Most embedded devices lack display outputs because of PCB size and cost limitations. In many cases, the output is available only up to the development/debug stage and is removed for production.
OEMs expect that what is tested is what ships, so testing should run on the exact hardware and software planned for the production phase. On such hardware, the display can be captured only with a camera. Verifying the display from a camera image is another challenge OEMs encounter, because the image quality may not be adequate to extract the GUI content.
eDAT Automation Solution
OEMs can overcome the above challenges by adopting an automation system. The eDAT automation framework, which supports OCR, is one of the best automation systems available in the market. The framework captures the DUT screen through HDMI or a camera and requires a configuration that holds the text region information (X, Y, height, and width) for each screen. Before extracting text from the image, eDAT OCR applies several image processing algorithms and filtering techniques. This preprocessing is what enables it to extract text from the captured image.
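A per-screen region configuration of this kind might look like the sketch below. This is not eDAT's actual configuration format, which the article does not show; the `TextRegion` class, the screen name `home`, and the `crop` helper are hypothetical, and the image is modeled as a plain list of pixel rows to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRegion:
    """Hypothetical region entry: where one text ID appears on a screen."""
    text_id: str
    x: int
    y: int
    width: int
    height: int


# Hypothetical configuration: screen name -> text regions on that screen.
SCREEN_REGIONS = {
    "home": [TextRegion("TXT_PERFORM_TEST", x=2, y=1, width=3, height=2)],
}


def crop(image, region):
    """Cut the region out of an image given as a list of pixel rows."""
    rows = image[region.y : region.y + region.height]
    return [row[region.x : region.x + region.width] for row in rows]


# Tiny 4x6 stand-in for a captured frame; each value encodes row*10 + column.
frame = [[r * 10 + c for c in range(6)] for r in range(4)]
for region in SCREEN_REGIONS["home"]:
    print(region.text_id, crop(frame, region))
```

Each cropped region would then be passed through preprocessing and OCR independently, which keeps neighboring text from contaminating the extraction.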
Running several image processing algorithms on every text region increases computation time. eDAT uses its own optimization algorithm, which spends this computation time only in the initial stage: it finds the processing algorithm that works for a text region and stores it, then reuses the stored algorithm for other regions of the image. If a text region is still unclear in the captured image, eDAT applies further processing algorithms until it can extract the proper text. All successful processing algorithms accumulate in a stored list for future reference.
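The idea of remembering which processing algorithm worked can be sketched as follows. This is an illustration of the general technique only, not eDAT's optimization algorithm, whose details the article does not give. The function `extract_with_cache`, the stand-in algorithms, and the fake OCR function are hypothetical; images are modeled as strings in which `#` represents noise.

```python
def extract_with_cache(region_image, algorithms, known_good, ocr):
    """Try previously successful algorithms first, then the remaining ones.

    algorithms: dict mapping a name to a callable(image) -> image
    known_good: list of names that worked before (updated in place)
    ocr: callable(image) -> str, returning "" when nothing is readable
    Returns (text, algorithm_name), or ("", None) if every algorithm fails.
    """
    ordered = known_good + [name for name in algorithms if name not in known_good]
    for name in ordered:
        text = ocr(algorithms[name](region_image))
        if text:
            if name not in known_good:
                known_good.append(name)
            return text, name
    return "", None


# Stand-ins for real image processing and OCR (assumptions for the demo).
def identity(image):
    return image


def denoise(image):
    return image.replace("#", "")


def fake_ocr(image):
    return "" if "#" in image else image


ALGORITHMS = {"identity": identity, "denoise": denoise}
known_good = []

print(extract_with_cache("te#st", ALGORITHMS, known_good, fake_ocr))
print(extract_with_cache("pa#ciente", ALGORITHMS, known_good, fake_ocr))
print(known_good)
```

In this sketch the first region pays for trying both algorithms, while the second region succeeds on the first attempt because the stored algorithm is tried first.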
eDAT OCR extracts text in each language by using the above configuration and compares it with the reference text. The common string file can serve as the reference text file. Finally, the extracted text and the corresponding text image are stored in the report as objective evidence.
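The comparison step might be sketched like this, assuming the reference strings are available as a dictionary keyed by text ID and language. The function names and the record layout are hypothetical and do not represent eDAT's report format. The sketch normalizes Unicode composition and whitespace before an exact comparison, using the standard `unicodedata` module.

```python
import unicodedata


def normalize(text):
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def compare_region(text_id, lang, extracted, reference_strings):
    """Build one report record comparing OCR output with the reference text."""
    expected = reference_strings[text_id][lang]
    return {
        "text_id": text_id,
        "lang": lang,
        "expected": expected,
        "extracted": extracted,
        "passed": normalize(extracted) == normalize(expected),
    }


# Hypothetical reference data taken from a common string file.
REFERENCE = {"TXT_PERFORM_TEST": {"es": "realizar la prueba del paciente"}}

good = compare_region("TXT_PERFORM_TEST", "es",
                      "realizar  la prueba del paciente ", REFERENCE)
bad = compare_region("TXT_PERFORM_TEST", "es",
                     "realizar la prueba del pacientc", REFERENCE)
print(good["passed"], bad["passed"])
```

The comparison is deliberately exact after normalization rather than fuzzy, since a single wrong character can change the meaning of the displayed text.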
eDAT OCR Roadmap
eDAT has a roadmap for OCR based on a machine learning algorithm. This will let users train the algorithm with their own text or symbols at the configuration stage. The training input may be in image format and can contain a single symbol, word, or sentence.