Automatic filtering of submissions

For some time now World Site Index has run an interactive filter on all new submissions. This filter checks the title and description for common problems, such as keyword stuffing in the description or duplicating the title within the description.

For those who are interested, the filter is written in PHP and is freely available from our main site; just follow the link to the directory submission filter page.

As with any automated system, it doesn’t get it right all the time: sometimes it is too strict, other times too lenient.

Recently we disabled the filter for a couple of days to see what would happen to the quality of the submissions and whether it was practical to remove the filter completely. The results were as we expected, and very disappointing. The bulk of the submissions made while the filter was disabled were deleted because almost no effort had been made with them.

Some examples are below with company names removed:

“The I can’t be bothered submission”

Title: – Business and web directory
Description: – Business and web directory

Not much can be said about these.

“The we want to give you all our business submission”

Description: We offer custom web design and development services adhering to highest international standards at a very affordable price.

Yes, you did read that correctly: a submission written in the 1st person will generate business for us. We should probably say thank you rather than trying to make people write in the 3rd person.

We found that we were being contacted and asked to supply goods and services when listings carried a phrase such as “We offer”, while listings written in the 3rd person seldom, if ever, generated an enquiry to us. Part of the filtering is therefore to force people into writing in the 3rd person.
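
As a rough illustration of this kind of check, here is a minimal sketch in Python. It is not the PHP filter itself: the word list and the function name first_person_words are assumptions made for the example.

```python
import re

# Hypothetical word list; the real filter's list is not given in this post.
FIRST_PERSON = {"we", "our", "ours", "us", "i", "my"}


def first_person_words(description):
    """Return the first-person words found in the description, in order."""
    # Split into lowercase words so "We" and "we" are treated the same
    # and "custom" is not mistaken for "us".
    words = re.findall(r"[a-z']+", description.lower())
    return [word for word in words if word in FIRST_PERSON]


if __name__ == "__main__":
    text = "We offer custom web design and development services."
    found = first_person_words(text)
    if found:
        print("Please rewrite in the 3rd person; found:", ", ".join(found))
```

A submission would pass this check only when the returned list is empty, so “We offer custom web design” is rejected while “Offers custom web design” is accepted.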

“The keyword stuffed submission”

Description: unsecured,personal loan unemployed,personal loan unemployed unsecured,student loan unemployed, student unemployed debt consolidation.personal loans,secured loans,unsecured loans.


Description: E-bookstore,e-books;art,history, poetry,guide travel books,nature,cooking and food,photography,music,health and medicine,psychology books

These really need no explanation as to why they are not acceptable.
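
One simple way to spot descriptions like these is to compare the number of list separators with the number of words. The sketch below is in Python and is only an illustration, not the PHP filter; the separator set, the 0.2 threshold and the function names are all assumptions chosen for the example.

```python
import re

# Assumed values for illustration; the real filter's rules are not given here.
SEPARATORS = ",;|"
MAX_SEPARATOR_DENSITY = 0.2  # more than one separator per five words


def separator_density(description):
    """Return the number of list separators per word in the description."""
    # Hyphenated terms such as "e-books" count as one word.
    words = re.findall(r"[A-Za-z]+(?:[-'][A-Za-z]+)*", description)
    if not words:
        return 0.0
    separators = sum(description.count(ch) for ch in SEPARATORS)
    return separators / len(words)


def looks_keyword_stuffed(description):
    """Flag descriptions that read like a separator-delimited keyword list."""
    return separator_density(description) > MAX_SEPARATOR_DENSITY
```

An ordinary sentence has few or no commas per word and passes, while both descriptions above are dense with separators and get flagged. A heuristic this crude will sometimes misfire on legitimate lists of products or services.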

Lesser infractions

  • Just about all submissions that weren’t outright keyword stuffing repeated the title at the start of the description. That is nothing a quick delete can’t fix, unless there’s not enough left to make sense.
  • All capitals for the title and description.
  • The 1st letters of all words capitalised.
  • Insanely short descriptions, some of just two words.
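
The four checks in this list are simple enough to sketch in a few lines. The Python below is an illustration only, not the PHP filter: the five-word minimum, the four-word cut-off for the capitalisation check and all function names are assumptions.

```python
MIN_WORDS = 5  # assumed minimum description length, in words


def is_every_word_capitalised(text):
    """True if every word starts with a capital but the text is not all caps."""
    words = [w for w in text.split() if w[0].isalpha()]
    # Very short texts are skipped (assumed cut-off of four words).
    return (
        len(words) >= 4
        and not text.isupper()
        and all(w[0].isupper() for w in words)
    )


def lesser_infractions(title, description):
    """Return a list of the lesser problems found in a submission."""
    problems = []
    title, description = title.strip(), description.strip()
    if title and description.lower().startswith(title.lower()):
        problems.append("title repeated at start of description")
    if title.isupper() or description.isupper():
        problems.append("all capitals")
    if is_every_word_capitalised(description):
        problems.append("every word capitalised")
    if len(description.split()) < MIN_WORDS:
        problems.append("description too short")
    return problems
```

Run against the “I can’t be bothered” example above, with the same text as title and description, this sketch reports both the repeated title and the too-short description.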

Will the filter stay?

While the filter isn’t perfect and does cause problems for some legitimate submissions, overall the quality of the submitted text is better with the filter enabled. As a result, many more sites get accepted into the directory.

In a perfect world there would be enough time for us to rewrite every title and description, but with the volume of free submissions we handle each day this is not an option. Minor edits we can do; a submission that needs a full rewrite goes straight to the bin.

Until something better comes along or the majority of people start to submit sensibly, it seems there is no choice but to keep the filter in place or stop accepting submissions.

The filter stays!

