A mysterious manuscript whose meaning has eluded linguists and cryptographers for centuries could be about to give up its secrets.
According to new research published in the scientific journal PLOS One, the Voynich manuscript contains patterns and structures "compatible with those found in real language sequences".
The manuscript, named for the Polish book dealer Wilfrid Michael Voynich, who purchased it in 1912, has been carbon dated to the early 15th century.
Its text has never been deciphered, as it bears no obvious similarity to any known language. Illustrations throughout the manuscript suggest it could be a guide to herbal medicine, but many of the plants don't match any known species.
"The text is unique, there are no similar works and all attempts to decode any possible message in the text have failed," Marcelo Montemurro, a theoretical physicist by day, told BBC News. "It's not easy to dismiss the manuscript as simple nonsensical gibberish, as it shows a significant [linguistic] structure."
Using modern statistical methods and computer analysis, Dr Montemurro says he has found "semantic networks" of related words within the text.
"There is substantial evidence that content-bearing words tend to occur in a clustered pattern, where they are required as part of the specific information being written," he says.
"Over long spans of texts, words leave a statistical signature about their use. When the topic shifts, other words are needed.
"The semantic networks we obtained clearly show that related words tend to share structure similarities. This also happens to a certain degree in real languages."
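The clustering idea Dr Montemurro describes can be illustrated with a toy calculation. The sketch below is not the method used in the study; it is a simplified stand-in that assumes a text can be split into equal-sized chunks, and the function names, the made-up sample text and the parameters (`n_chunks`, `n_shuffles`) are all hypothetical. It measures how evenly a word is spread across the chunks, then compares that with the same text randomly shuffled, which destroys any topical structure.

```python
import math
import random
from collections import Counter


def chunk_entropy(words, target, n_chunks):
    """Shannon entropy (bits) of how `target` is spread across equal-size chunks."""
    size = math.ceil(len(words) / n_chunks)
    counts = Counter(i // size for i, w in enumerate(words) if w == target)
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"{target!r} does not occur in the text")
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def clustering_score(words, target, n_chunks=4, n_shuffles=200, seed=0):
    """Mean entropy over shuffled copies minus the observed entropy.

    A clearly positive score means the word is concentrated in fewer
    chunks than chance alone would predict.
    """
    rng = random.Random(seed)
    observed = chunk_entropy(words, target, n_chunks)
    shuffled = list(words)
    total = 0.0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # destroys word order, keeps word frequencies
        total += chunk_entropy(shuffled, target, n_chunks)
    return total / n_shuffles - observed


# Made-up sample text (an assumption for illustration only):
# "root" appears in the first section only, "the" appears throughout.
toy = ("the root the leaf " * 10 + "the star the moon " * 10
       + "the bath the pipe " * 10 + "the seed the oil " * 10).split()

print("root:", clustering_score(toy, "root"))
print("the: ", clustering_score(toy, "the"))
```

In this toy text the topic word "root" scores well above zero because it is confined to one section, while the function word "the" scores close to zero because it is spread evenly. Comparing against shuffled copies is one simple way to separate genuine clustering from what word frequency alone would produce.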
Dr Montemurro says the text can't be a medieval hoax because such knowledge of language structure was not available at the time.
Its resistance to translation has led many to conclude the Voynich manuscript is a hoax, or an extremely tough cipher.
Skeptic Gordon Rugg, who demonstrated on William Shatner's television show Weird or What? how easy it is to generate convincing – but entirely fake – "Voynichese" text, says it's unlikely to be a cipher because it appears to have individual words.
If it were a word-based cipher concealing a message in a known language, he argues, it would have been cracked long ago.
"Some of the features of the manuscript's text, such as the way that it consists of separate words, are inconsistent with most methods of encoding text. Modern codes almost invariably avoid having separate words, as those would be an easy way to crack most coding systems."
Dr Rugg has suggested the manuscript – if a hoax – could be the work of medieval philosopher Roger Bacon or Elizabethan polymath John Dee (who was later immortalised by Blur singer Damon Albarn on his album Dr Dee: An English Opera).
But Dr Montemurro says the hoax hypothesis isn't feasible unless it accounts for the underlying statistical complexity of the text.
"While the mystery of origins and meaning of the text still remain to be solved, the accumulated evidence about organisation at different levels, limits severely the scope of the hoax hypothesis and suggests the presence of a genuine linguistic structure," Dr Montemurro writes in the study.