Document spinning is a term for generating variations of a source document. The goal is for variations to be different enough from one another that they will be seen as unique content. Spinning is usually accomplished with spintax, in which some words in a document are designated as choices between synonyms. Each spintax word has one of its synonyms randomly chosen when a document variation is spun. While spintax is useful for changes at the word and phrase level, this may not be enough to prevent variations from being detected as duplicate content. The limitations of spintax are explained in our article "Spintax Is Not Enough."
The solution for spintax’s shortcomings is to alter a document not just at the word level with spintax but also at the structural level with sentence and paragraph variants. Each variant maintains the approximate meaning of its sentence or paragraph, so variants work the same as synonyms in spintax. One is chosen at random from a set of variants. Spinning a document at both levels can produce variations with complex changes that go far beyond using spintax alone. This article presents techniques for writing variants.
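The two levels of spinning can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `{a|b|c}` brace notation is an assumed (though common) way of writing spintax, nesting is not handled, and the names `spin_spintax` and `spin_document` are hypothetical.

```python
import random
import re

# Assumed notation: {a|b|c} marks a spintax choice (nesting is not supported).
SPINTAX = re.compile(r"\{([^{}]*)\}")

def spin_spintax(text, rng):
    """Replace each {a|b|c} group with one randomly chosen option."""
    return SPINTAX.sub(lambda m: rng.choice(m.group(1).split("|")), text)

def spin_document(sentences, rng=None):
    """Pick one variant per sentence, then resolve the spintax inside it."""
    if rng is None:
        rng = random.Random()
    chosen = [rng.choice(variants) for variants in sentences]
    return " ".join(spin_spintax(sentence, rng) for sentence in chosen)

# Each inner list holds the variants of one sentence.
document = [
    [
        "We {hid|took shelter} in the mountains when the zombie apocalypse began.",
        "When the zombie apocalypse began, we {hid|took shelter} in the mountains.",
    ],
    [
        "If we hide then the zombies might {leave|go away}.",
        "The zombies might {leave|go away} if we hide.",
    ],
]

print(spin_document(document, random.Random(1)))
```

Passing a seeded `random.Random` makes a spin reproducible, which is convenient when checking the output by hand.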
There are several strategies for creating sentence variants. The grammar of sentences can be rearranged. Complex sentences can be divided or simple sentences combined. Inconsequential information can be added or subtracted. Sentences can be completely rewritten. Most of the methods are simple. With a bit of practice, they can be done almost by rote and with little thought, so making sentence variants can be done quickly.
Rearrangement of grammar is the easiest way to make variants. This is usually done by moving clauses. Several examples are shown below.
We hid in the mountains when the zombie apocalypse began.
When the zombie apocalypse began, we hid in the mountains.

If we hide then the zombies might leave.
The zombies might leave if we hide.
Rearrangement easily yields two variants of a simple sentence. Complex sentences with many clauses offer more opportunities for rearrangement, so many variants can be created from one sentence. The following example demonstrates this.
As more people turned into zombies, the cities were abandoned and the government collapsed.
As more people turned into zombies, the government collapsed and the cities were abandoned.
The cities were abandoned and the government collapsed as more people turned into zombies.
The government collapsed and the cities were abandoned as more people turned into zombies.
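The four variants above follow a mechanical pattern: the two main clauses can appear in either order, and the subordinate clause can lead or trail. A rough sketch of that pattern, assuming the clauses have already been separated by hand (the function names are hypothetical, and joining with "and" only reads naturally for two main clauses):

```python
from itertools import permutations

def capitalize_first(text):
    """Uppercase only the first character, leaving the rest untouched."""
    return text[0].upper() + text[1:]

def rearrangements(subordinate, main_clauses):
    """Every order of the main clauses, with the subordinate clause first or last."""
    variants = []
    for order in permutations(main_clauses):
        main = " and ".join(order)
        variants.append(capitalize_first(f"{subordinate}, {main}."))
        variants.append(capitalize_first(f"{main} {subordinate}."))
    return variants

variants = rearrangements(
    "as more people turned into zombies",
    ["the cities were abandoned", "the government collapsed"],
)
for variant in variants:
    print(variant)
```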
Other types of reordering may be understandable to readers but may sound odd in some contexts. The short example that follows shows this. The second variant sounds stilted.
We went to the hills.
To the hills we went.
Breaking and Combining Sentences
Another way to make variants is to break compound sentences in two or combine short sentences into one compound sentence. The examples below illustrate this.
The infection spread quickly, and nearly everyone succumbed in weeks.
The infection spread quickly. Nearly everyone succumbed in weeks.

The cold was unpleasant but a benefit was that it made the zombies sluggish.
The cold was unpleasant. A benefit was that it made the zombies sluggish.
Variants should convey the same meaning, but an approximate meaning rather than an exact one is usually good enough. This opens the opportunity for adding or subtracting inconsequential information. Variants created this way differ more than variants made by rearranging clauses, so they are higher quality for the purpose of not being seen as duplicate content.
Variants can easily be made this way by subtracting unneeded clauses. Unnecessary adjectives can also be removed, but it is typically better to do that with spintax, which can both make a word optional and swap in alternate synonyms. In general, long sentences are made shorter by removing information, and short sentences are made longer by adding inconsequential information.
The examples below demonstrate removing information.
The zombies, which reeked of decay, milled about outside.
The zombies milled about outside.

Through binoculars I watched the city burn.
I watched the city burn.
The following examples demonstrate adding information.
The bite slowly changed him.
The bite that seemed harmless at first slowly changed him.

We saw the building from across the lake.
We saw the building from across the lake where the population used to frolic.
Rewriting sentences is expressing the same point but in a completely different way, ideally with words that differ by more than just being synonyms of the original words. This takes more effort than rearranging sentence text and may not always be possible. Such variants greatly increase the uniqueness of spun variations and lessen the chance that variations will be detected as duplicate content. An example follows.
We went to the supermarket but the shelves were stripped bare.
We tried to buy supplies but stores had already been emptied of everything.
An Example of Sentence Variants
The process of creating sentence variants using the above techniques is demonstrated below. The original sentence is shown first. The variants follow.
In late fall, just before Thanksgiving, a disease with initial flu-like symptoms began to quickly filter through the population.
A disease with initial flu-like symptoms began to filter through the population just before Thanksgiving.
A disease with initial flu-like symptoms began in late fall. It quickly filtered through the population.
A disease began in late fall. It quickly filtered through the population.
A disease filtered through the population. It is believed to have first appeared in late fall.
Starting just before Thanksgiving, a disease with flu-like symptoms began to filter through the population.
In late fall a disease began to filter through the population.
A disease began to filter through the population in late fall.
This example shows how easily eight variants of the same sentence (counting the original) can be created. Typically, making this many variants is overdoing it. If every sentence in a document has variants, then any one sentence needs only a few variants for the document to generate a large number of different variations. For example, eight sentences, each with two variants, have 256 possible combinations. If each sentence has three variants, then there are 6561 possibilities.
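The counts above come from multiplying the number of variants of each sentence, since each sentence is chosen independently. A small sketch of that arithmetic (the function name `count_variations` is hypothetical, and `math.prod` requires Python 3.8 or later):

```python
from math import prod

def count_variations(variant_counts):
    """Number of distinct documents when each sentence is chosen independently."""
    return prod(variant_counts)

print(count_variations([2] * 8))  # eight sentences, two variants each: 256
print(count_variations([3] * 8))  # eight sentences, three variants each: 6561
```

The same function handles uneven counts, for example `count_variations([2, 3, 2])` for a document whose sentences have different numbers of variants.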
Using paragraph variants allows a document’s structure to be changed more substantially than using only sentence variants. Paragraph variants can be different sizes, either by using sentences of different lengths or a different number of sentences, which will displace the text that follows. Points made by a paragraph can be reordered and ideas introduced can be rearranged; this can only take place at a level above the sentence scope. Ultimately, paragraphs can be rewritten to communicate the same information but do so with different words, reasoning, examples, or metaphors.
Number of Sentences
An easy way to make paragraph variants is to change the number of sentences. As sentences are removed, a paragraph will become simpler; and as sentences are added, a paragraph will become more complicated. This can be considered the paragraph version of removing or adding inconsequential information in sentences.
Removing sentences is a simple process of omitting sentences or combining two long sentences into one shorter sentence. For example, consider the six-sentence paragraph below.
When the sickness began, no one connected their condition with the alarm of a week earlier. November is flu season, and it seemed like the usual fall bug. Most victims experienced a hacking cough and persistent nasal drain at first. A few days later came mild headaches. Then came head pain so agonizing that the infected screamed intermittently. Within two weeks people began streaming into hospitals and clinics.
It can be turned into a three-sentence paragraph like the one that follows.
The sickness began in November, flu season, and it seemed like the usual fall bug. The infected experienced a hacking cough that soon changed to mild headaches. Within two weeks people, many with head pain so agonizing they screamed intermittently, began streaming into hospitals and clinics.
Adding sentences is the reverse of the example above. A paragraph is expanded by adding extra detail that does not substantially change its meaning.
Order of Ideas
Paragraphs often introduce points that are independent of each other so they can be introduced in any order. This presents an opportunity for restructuring a paragraph. For example, consider the short paragraph below.
The house was not safe from zombies. The door was flimsy. The windows were large. The walls were thin. We decided to move.
The middle three sentences could be put in any order, which would give six possible paragraph structures.
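The six structures can be enumerated directly. The sketch below assumes the paragraph has already been split into sentences and that the caller knows which span is safe to reorder; the helper name `reorderings` is hypothetical.

```python
from itertools import permutations

paragraph = [
    "The house was not safe from zombies.",
    "The door was flimsy.",
    "The windows were large.",
    "The walls were thin.",
    "We decided to move.",
]

def reorderings(sentences, start, end):
    """All paragraphs obtained by permuting sentences[start:end] in place."""
    head, middle, tail = sentences[:start], sentences[start:end], sentences[end:]
    return [" ".join(head + list(order) + tail) for order in permutations(middle)]

structures = reorderings(paragraph, 1, 4)
print(len(structures))  # 3! orderings of the three middle sentences
for structure in structures:
    print(structure)
```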
Rewriting paragraphs completely allows variants to differ in ways beyond rearranging text and subtracting or adding sentences. The reasoning of a paragraph can be changed. For example, a paragraph that uses an extended metaphor to illustrate a point could be written to use a different metaphor. Rewriting allows the same point to be made in a way that is unrelated to other variants. What is more, a variant can be rewritten so that it shares few words with the original paragraph. If variants share little more than a general idea, then they share little text, and shared text is what is used to detect duplicate content.
Take the two paragraphs below for example. The second is a rewrite of the first. Both express how an infection spread through a city. A few details and words are shared between the two, which is usually necessary when the same point is being made, but most of the text is very different.
When the sickness began, no one connected their condition with the alarm of a week earlier. November is flu season, and it seemed like the usual fall bug. Most victims experienced a hacking cough and persistent nasal drain at first. A few days later came mild headaches. Then came head pain so agonizing that the infected screamed intermittently. Within two weeks people began streaming into hospitals and clinics.

The infection did not seem extraordinary as it spread through the city. It was November, flu season. The initial symptoms were mild but slowly escalated day by day. By the end of the first week, many people stopped going to work. The city ground to a halt as those with agonizing headaches overwhelmed the hospitals.
If paragraphs each make a point and the points are independent of each other then the paragraphs can be randomly reordered. For example, consider a document that has an introductory paragraph, three supporting paragraphs explaining independent points, and a concluding paragraph. The three middle paragraphs could be randomly reordered. There will be six possible document structures.
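At spin time, one of those structures is picked at random rather than enumerated. A minimal sketch, assuming the first and last paragraphs must stay in place and every paragraph between them is independent (the function name `shuffle_body` and the placeholder paragraph text are illustrative only):

```python
import random

def shuffle_body(paragraphs, rng=None):
    """Keep the first and last paragraphs fixed; shuffle those in between."""
    if rng is None:
        rng = random.Random()
    body = paragraphs[1:-1]  # slicing copies, so the input list is not modified
    rng.shuffle(body)
    return [paragraphs[0], *body, paragraphs[-1]]

# Placeholder paragraphs standing in for real text.
document = ["Introduction.", "Point A.", "Point B.", "Point C.", "Conclusion."]
print(shuffle_body(document, random.Random(7)))
```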
Duplicate Content Detection
Duplicate detection can be thought of as plagiarism detection, and plagiarism detection algorithms generally work by finding sections of text that documents have in common. A sophisticated algorithm may not be fooled by changing words with spintax because a reverse thesaurus can be used to convert words to canonical synonyms, which will eliminate or reduce the effect of spintax. For example, all occurrences of "car," "auto," and "automobile" could be converted to "car" before finding common document sections. Even if spintax had been used to randomly change instances of "car" to one of the three synonyms, the change would have no effect. The use of sentence and paragraph variants makes detection of duplicates more difficult, but there are still concerns.
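The canonical-synonym step can be illustrated with a toy reverse thesaurus. This is only a sketch of the idea: the `CANONICAL` table and the `canonicalize` function are hypothetical, and a real detector would use a far larger vocabulary and its own tokenization.

```python
import re

# Hypothetical reverse thesaurus: each synonym maps to one canonical word.
CANONICAL = {"auto": "car", "automobile": "car"}

def canonicalize(text):
    """Lowercase, split into words, and replace synonyms with canonical forms."""
    words = re.findall(r"[a-z']+", text.lower())
    return [CANONICAL.get(word, word) for word in words]

spun_a = "He parked the automobile near the lake."
spun_b = "He parked the auto near the lake."

# After canonicalization the two spun sentences are word-for-word identical.
print(canonicalize(spun_a) == canonicalize(spun_b))
```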
Avoiding detection basically boils down to minimizing the amount of text that variations have in common with each other. Some methods for making variations achieve this better than others. Simple rearrangements move sections of text, but the sections themselves stay the same. Even though the sections may be short, if enough of them are shared between variations then the variations might still be seen as duplicate content. This means that the more text is changed in a variant, the more likely it is to be seen as unique content. Rewriting works better than rearrangement.
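One way to make "text in common" concrete is to compare overlapping word sequences (often called shingles). The sketch below is an assumption about how a detector might measure overlap, not a description of any particular one; it uses three-word shingles and Jaccard similarity, and the function names are hypothetical.

```python
import re

def shingles(text, n=3):
    """Set of all runs of n consecutive words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def similarity(a, b, n=3):
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

rearranged = similarity(
    "We hid in the mountains when the zombie apocalypse began.",
    "When the zombie apocalypse began, we hid in the mountains.",
)
rewritten = similarity(
    "We went to the supermarket but the shelves were stripped bare.",
    "We tried to buy supplies but stores had already been emptied of everything.",
)
print(rearranged, rewritten)  # the rearranged pair shares far more than the rewritten pair
```

Under this measure the rearranged pair still shares most of its three-word sequences, while the rewritten pair shares none, which matches the point that rewriting works better than rearrangement.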