How a SF startup helped a Japanese publisher overcome its ‘text dilemma’

A Silicon Valley startup has solved one of the biggest headaches for a top publishing company in Japan: how to convert printed Japanese text, which is written vertically, into digitally compatible HTML text, not only for use in digital magazines and apps but also to make it easier to translate for an international audience.

To explain how the solution came about, Akira Iwase of Shueisha Publishing and Stanley Chien of Kono spoke to FIPP contributor Felix Mago off stage at the recent Digital Innovators’ Summit 2017 in Berlin. 

Shueisha Publishing found it nigh on impossible to convert Japanese PDF content to digitally compatible content, explains Akira Iwase. “Most Japanese magazines’ text is printed vertically. This makes it very difficult to convert PDF text into HTML. This slowed down our digital development. For example, if we wanted to publish in HTML, we had to manually convert text to a standard photo, extract the data from the PDF and manually convert this into HTML. It took a lot of time and was just as expensive.”

A similar problem existed for translating printed PDF text, even though some of Shueisha Publishing’s titles, like the manga comics ‘Naruto’ and ‘One Piece’, were sought after in the US and Europe. Likewise, several of Shueisha’s fashion magazines were in demand because Japanese fashion is considered a market leader in Asia, which presented a lucrative opportunity for translation and syndication.

As digital head of publishing, Iwase was left with a major challenge in a market where the population is shrinking. Thankfully, Silicon Valley startup Kono came to his rescue. The company, founded in 2011 by Stanley Chien, started to develop automated technology that extracts Japanese text from PDF and exports it as HTML. Or in the words of Chien: “The technology we developed is called ‘Smarticle’... It extracts around 90 per cent of Japanese content out of PDF automatically using machine-learning algorithms.

“It can also identify subtitles and learn how to solve more complex language problems. This allows us to extract text (from PDF publications) and divide it into separate articles. After we have extracted it, we can reflow the content, so it's much easier to read on mobile devices. And it's automatically ‘html-ed’.”

Once the text has been extracted and converted, automated translation into languages such as English and French becomes easier.

“We can do even more interesting things with the extracted text… such as introducing artificial intelligence for recommendation engines, similar to Netflix, but for magazines. Based on what the user has read previously and their user profile we can feed them with articles they may be interested in. We can offer these recommendations in all Asian languages. So, Smarticle not only extracts the content for republishing on mobile devices, we can also provide data and analytics for personalised recommendations.”

In a world where interest in Japan and Asia is growing, this technology creates large opportunities for Asian publishers, says Chien. He references a paid-for fashion magazine app in Apple’s App Store, literally translated as ‘Japanese Magazine’, which became extremely popular in China but was reportedly a pirated version of a Japanese magazine. According to Chien, it briefly became the best-selling app before it was identified as fake and taken down by Apple.

“This proves that there's a large demand throughout Asia for Japanese content. So, I think there is a good opportunity for us to export the content. That's why we’re working with Shueisha and other Japanese publishers to translate some of their content so that more people in Asia can consume their magazines.”

Chien adds that this is a “golden opportunity”, giving Kono the chance to work with a spectrum of Japanese publishers, using its technology to extract content and then translate that content into multiple languages. “We at Kono and other publishers across Japan will benefit from it.”
