Thursday, January 29, 2009

Data Mining: Solid Gold Information


Data mining is the practice of collecting and storing information in humongous databases. The information is gathered from opt-ins, on-line surveys, forms and other “voluntary” means of collection, usually from, and about, customers and buyers.

 

Who uses data mining? Retail outlets, insurance companies, banks, airlines and other industries that not only collect data, but derive benefit from analyzing that data in a scientific, systematic manner to improve service and profit margins. And if it works for the big guys, it’ll work for you (only on a slightly smaller scale).

 

What are we looking for?

Using data mining technology, industries are looking for trends before they become trends. Relationships between customer A and widget B. Patterns of activity, unusual events – the list is endless and growing all the time.

 

The fact is, billions and billions of pages are stored on computers and billions of those billions of pages are available through any search engine. And while this information can help your on-line activities in a general way, data mining your own historical repository of data will reveal useful information about activities closer to home – yours!

 

If you’ve been in business on-line for any length of time, even a couple of years, you’re sitting on solid gold marketing data. Your database of customers and what they bought, where they live and how they pay. And you can use an analysis of this information to improve the performance of your web site.

 

How can it help me?

Probably the most useful way data mining will help small- to mid-sized site owners is by defining the target demographic – the characteristics of most buyers. Men or women? Age? Zip code? Income bracket? Using data harvesting and analytic software, you’ll quickly be able to develop a picture of that perfect buyer – the one who buys the most, most often.
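
If your order history lives in something as simple as a spreadsheet export, even a short script can rough out that picture. Here’s a minimal sketch in Python, assuming a hypothetical customers.csv with gender, age and zip_code columns (swap in whatever fields you actually collect):

```python
import csv
from collections import Counter

# Hypothetical export of your order history: one row per order,
# with whatever demographic fields you actually collect.
with open("customers.csv", newline="") as f:
    orders = list(csv.DictReader(f))

# Tally the traits of the people who actually buy.
genders = Counter(o["gender"] for o in orders)
zips = Counter(o["zip_code"] for o in orders)
ages = [int(o["age"]) for o in orders if o["age"]]

print("Top gender:", genders.most_common(1))
print("Top zip codes:", zips.most_common(3))
if ages:
    print("Average buyer age:", round(sum(ages) / len(ages), 1))
```

A dedicated analytics package will slice the data far more finely, but even a simple tally like this starts to tell you who that perfect buyer is.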

 

This information equips you to develop marketing campaigns targeted specifically at your key demographic. If you’re selling knitting supplies, using a Harley-Davidson as the centerpiece of your e-mail campaign probably won’t pull as much as a nice picture of a kitten playing with a ball of yarn. Data harvesting enables site owners (and huge media and retail conglomerates) to target their marketing with pinpoint precision. (You don’t think those Gap ads were created by accident, do you?)

 

Interactive Marketing

Of most importance to on-line business owners, interactive marketing appeals to visitors to your web site. What can visitors do? Where can they go? What can they learn? And see?

 

By analyzing harvested data, you can track the movements of site visitors to determine which features draw attention and which are just taking up space. Google Analytics will even perform the analysis for you, showing graphically which site pages attract attention and which are quickly passed over.

 

For on-line retailers, this kind of analysis defines your most valuable digital real estate and, obviously, this is where you’d place your most popular or profitable products, announcements of upcoming sales and other “targeted” information.

 

Is it working?

It would be nice to know whether your AdSense program is pulling better than the banners you’ve placed on a dozen different sites. Data harvesting will give you the answer quickly once you establish a baseline.

 

The baseline is what’s happening now – the status quo. With an established baseline, you have a yardstick to determine whether your PPC program deserves more dollars or whether the click-through rate on your banners isn’t worth the money you’re spending.
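
The arithmetic itself is simple once the numbers sit in one place. Here’s a minimal sketch, assuming you’ve pulled last month’s spend and sales per channel from your own reports (the figures below are invented for illustration):

```python
# Hypothetical last-month numbers pulled from your own ad reports.
channels = {
    "ppc":     {"spend": 300.00, "sales": 45},
    "banners": {"spend": 300.00, "sales": 8},
}

# Baseline: overall cost per sale across everything you're running now.
total_spend = sum(c["spend"] for c in channels.values())
total_sales = sum(c["sales"] for c in channels.values())
baseline = total_spend / total_sales

for name, c in channels.items():
    cost_per_sale = c["spend"] / c["sales"]
    verdict = "beats" if cost_per_sale < baseline else "trails"
    print(f"{name}: ${cost_per_sale:.2f} per sale ({verdict} the ${baseline:.2f} baseline)")
```

Run the same numbers every month; the trend, not the single snapshot, tells you where the ad dollars belong.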

 

Is it bogus?

Large on-line (and real-world) retailers use data harvesting to better detect fraudulent activity. MasterCard, for example, will quickly contact cardholders whose accounts show unusual activity. Using data harvesting, the credit card company knows you’ve never bought anything in Taiwan. Then, in a matter of two hours, 23 transactions from Taiwan show up on your card. Now that’s called an anomaly – something out of the ordinary.

 

The MasterCard program continues with follow-through. You’re likely to get a call from a MasterCard representative asking whether you really did purchase 23 racing bikes in Taiwan within the past 24 hours. If not, the company can often void the transactions before the charges post.
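
You can apply the same idea, on a much smaller scale, to your own order log. Here’s a minimal sketch, assuming a hypothetical per-customer history of (country, timestamp) pairs; it flags purchases from a country the customer has never bought from before and bursts of orders inside a two-hour window:

```python
from datetime import datetime, timedelta

# Hypothetical order history for one customer: (country code, timestamp).
history = [("US", datetime(2009, 1, 5, 14, 0)),
           ("US", datetime(2009, 1, 12, 9, 30))]
new_orders = [("TW", datetime(2009, 1, 29, 10, 0)),
              ("TW", datetime(2009, 1, 29, 10, 4)),
              ("TW", datetime(2009, 1, 29, 10, 9))]

usual_countries = {country for country, _ in history}

# Rule 1: a country this customer has never bought from before.
for country, when in new_orders:
    if country not in usual_countries:
        print(f"Anomaly: first-ever purchase from {country} at {when}")

# Rule 2: an unusual burst of orders inside a two-hour window.
times = sorted(when for _, when in new_orders)
if len(times) >= 3 and times[-1] - times[0] <= timedelta(hours=2):
    print(f"Anomaly: {len(times)} orders within two hours")
```

Real fraud systems weigh hundreds of signals, but the principle is the same: establish what’s normal, then flag what isn’t.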

 

Will it make my customers happier?

Much happier. You’ll be ahead of the curve on spotting trends, so you’ll have the latest when visitors come to shop. You’ll be able to better predict seasonal buying patterns for your particular goods or services. You’ll be able to improve warehousing, order handling, inventory management and more – even if the inventory is stored in a spare bedroom.

 

Where do I get this wonderful tool?

You’ve got the data – or at least you should have it, if you’ve been in business for a while. That customer data just needs to be analyzed to equip you to refine your site, target your ideal buyer, identify trends ahead of the competition, spot fraud and deliver the precise product at the precise time to exactly the right buyer. Metrics and analytics software, like Google Analytics, will help crunch the raw data into meaningful results.

 

If you haven’t started using the information you have on your hard drive, you’re wasting some of the best information you’ll ever have concerning the success of your business.

 

Use it or lose it. 

Tuesday, January 27, 2009

Web 5.0: What Will the W3 Look Like In 10 Years?

Predictions are easy. Especially when they’re 10 years out and no one remembers you stuck your neck out back in 2009.

 

When I was a kid, I was told that in the future I’d have my own hovercraft. Traffic jams, a thing of the past. They also told us that nuclear power generation would let us disconnect our electric meters, power would be so cheap. Boy, did those prognosticators get it wrong, as I open my $400 monthly oil bill.

 

Even though tellers of future events are wrong most of the time (even Nostradamus got it wrong), Webwordslinger reads the tea leaves and makes his utterly fearless predictions for what the web of our grandkids will look like.

 

Just look at the changes that have occurred in the 15 years since you became web smart.

 

What do you think the future holds for the web? We’d like to know, so please leave your opinions below. We’d love to hear from you.

 

1. In less than 10 years, your TV and computer will blend seamlessly into one device. Watch TV on your computer. Click a link on the TV screen to get a sample of new Fab laundry detergent.

 

Further, we’ll see shows develop around viewer interactivity. No more reaching for the phone to try to get in your American Idol vote. Just click your fave and you’re done.

 

2. Miniaturization of computers will continue, especially as voice activation and recognition become more sophisticated. Ten years from now, you will use a device no bigger than the frame of a pair of eyeglasses. Through voice commands (keyboards are sooo 2011) you’ll have complete access to your personal data and a web that’s in its fifth incarnation.

 

3. We’ll all be stars. Anybody with something to say will become a star when blogs, TV and solid information collide. You’ll be able to call up any number of thousands of video blogs on your TV set to learn everything from fly fishing to how to remove your own pancreas!

 

4. You’ll interact more with the TV and computer. See something you like, click your TV mouse and learn more from your drop-down, glasses-sized computer – immediately. Consider how that rapid transmission of information will affect everything from your food choices to who you vote for.

 

Right now, TV and computers are taking baby steps toward integrating content from a variety of sources. Google, search engine par excellence, is now also a content provider with its acquisition of YouTube. And Microsoft is chasing Yahoo, threatening a hostile takeover. The reason?

 

Because these companies see the future, and it doesn’t really involve them to the degree they’d prefer. Want to send an email to a friend? No need to log on. Grab the TV keyboard and send it digitally to your friend, who will be able to read the text on his TV, eyeglass computer, ear PDA or cell phone.

 

Integration of technologies is a certainty for the future because there’s money to be made. Lots and lots of it and every content producer (TV, movies, newspapers, blogs, any form of content) will be under siege to produce more, better, faster.

 

5. Accessibility will increase. We’ve mentioned voice commands, but eyeball scanning will also be in place. Just look at the link for 2 seconds and you’re there. Think it. You’re there. This technology is already available in our sophisticated war machinery. It’s only a matter of time before it trickles down to the consumer level – like Velcro did.

 

6. Functionality skyrockets. We’re toddlers trying to synch up different platforms, languages, protocols and other digital details. But these are stumbling blocks, not brick walls.

 

We’ve seen huge growth in digital functionality in the past few years. Order your pizza on line, using your cell. And, if your cell is equipped with GPS, it’ll tell you how to get to the pizza place.

 

Utility and functionality will make us more productive. Also more reliant on digital communications.

 

7. In a digital world, an electromagnetic pulse could knock out the web. The web is a grid, and like dominoes, an EMP, properly placed, could throw us all off-line for months. Hey, welcome to the ’70s – again.

 

So we can expect to see the web become a more secure bastion – it’s now a necessary means of commerce. Just think about it: what would your business and your life look like if the web disappeared?

 

More secure walls and rebounders are being developed (we ain’t there yet, folks) to offset the effects of a terrorist EMP.

 

And let’s not forget hackers, who will have many more access points to a site and to your information. These black hats aren’t going to mosey out of town. In fact, hacker tactics grow more sophisticated (read: lethal) every day. So, in 10 years, we’ll be padlocked behind ironclad protection updated 10,000 times a second.

 

Wanna bet? 

Monday, January 26, 2009

Bots or Bods: Sometimes It's a Tough Choice


Any well-optimized site is going to make it easy to get spidered and indexed into the critical search engines fast – first time through. However, Googlebots and other crawlers have no brain, no soul, no conscience; they feel no pain and they never give up – ever! (Sounds a lot like The Terminator – not a bad comparison, actually.)

 

Bots and spiders (same thing) see one thing. Humans see something different. And the features you develop and add to your site to appeal to human thoughts and emotions may totally confuse a spider, creating chaos back at search engine headquarters.

 

On the other hand, the clever site owner can use the mindlessness of letter-string gobbling bots to some advantage, delivering content visible to humans, but unreadable by bots. So there are pluses and minuses to any decision you make in the design of your site. Bots or bodies?

 

What Humans See Isn’t What Spiders See

When you visit a web site, you see the site skin, sometimes called the presentation layer. This is all “front of the curtain” stuff designed specifically to appeal to human visitors. Color and design motifs, placement of content, graphics elements and other stylistic considerations are all for human consumption. Bots wouldn’t know a good-looking site from one that’s uglier than a mud fence. And they don’t even care!

 

Instead, look at your HTML code. The boring sub-structure made up of meta data, line after line of code, strange glyphs and indecipherable computer-speak. This is what bots see. This is what spiders spider. The underbelly of your drop-dead-gorgeous website. And, if you print out the HTML, XML and CSS programming in place, it’s not the least bit pleasing to the human eye. In fact, it’s black and white text. But spiders love it.

 

Using The Work Habits of an SEO Spider to Your Advantage

If you know what a spider knows about your site, you can take advantage of the crawler’s significant intellectual limitations.

 

Spiders crawl the underlying code of a web site. In doing so, they follow links. Their movements aren’t random. They’re directed. So how can you use this to your advantage?

 

Embed text links throughout the site body text – the text meant for human consumption. These embedded links should take spiders and humans to other relevant information on the site. Using these embedded text links ensures that spiders stick around longer, index more site pages and accurately assess the scope and value of your site to search engine users.

 

The downside of using embedded text links is that, if the content to which the spider is sent doesn’t synch up with the text in the link, the spider is easily confused. It may determine that you’re intentionally misdirecting site traffic and slam you. Thus, it’s important that embedded links actually lead to more expansive, albeit related, content. It’s equally important that spiders get the connection, something that can be handled on the coding side through the use of title tags and other individual page descriptors.

 

Another means of licitly exploiting the limitations of search engine spiders is to identify pages that should not be crawled. You don’t want spiders crawling the back office and posting your payroll records as part of the presentation layer, so you put up a “KEEP OUT” sign on the pages you want left uncrawled and, consequently, unindexed by the search engine.

 

Now, spiders are programmed to be suspicious and too many ‘Keep Out’ signs will set off alarm bells. Unscrupulous site owners employ this tactic to hype a product or service that’s not exactly on-target with how spiders “see” things. Those obnoxious, long-form sales letters, for example, can be excluded from a spider’s view simply by informing the passing crawler that this page is off limits.  
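
In practice, that “Keep Out” sign is usually a robots.txt file (or a robots meta tag on the page itself). Here’s a minimal sketch of how a well-behaved crawler reads those rules, using Python’s standard urllib.robotparser; the domain and paths are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep crawlers out of the back office and
# the long-form sales letter, leave everything else open.
robots_txt = """
User-agent: *
Disallow: /admin/
Disallow: /sales-letter.html
""".splitlines()

rules = RobotFileParser()
rules.parse(robots_txt)

for path in ("/products/widgets.html", "/admin/payroll.html", "/sales-letter.html"):
    allowed = rules.can_fetch("Googlebot", "http://www.example.com" + path)
    print(path, "->", "crawl away" if allowed else "keep out")
```

Keep in mind the rules are advisory: polite crawlers honor them, but a robots.txt file is not a security measure for genuinely sensitive pages.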

 

Bots never see the site skin. They’re incapable of ‘reading’ or indexing graphic elements including pictures, charts, graphs, Flash animations and other non-text elements. Further, search engine bots won’t know that this body of text is associated with the picture next to it, i.e. product descriptions and product pictures. (Bots are utterly without nuance or guile. They’re more like sledgehammers than calculators.)

 

You can use this limitation of spiders to your advantage by uploading any text you don’t want spidered in a graphics format. For example, if you want to launch a trial balloon by testing a new product, you might not want that new product spidered until your market testing is complete.

 

No problem. Create the text in Flash and upload it as a Flash file. Any graphics format can be used – gif, jpg, etc. Visitors will see the text but spiders will just “know” there’s some kind of graphic in that location.

 

A couple of cautions here. First, even though bots can’t read graphics files, that doesn’t mean you can or should mislead site visitors with useless or deceitful information disguised as a graphic. These “black hat” tactics will eventually catch up with site owners who are less than straightforward with visitors and spiders.

 

Caution number two: Because bots can’t assess graphics, any text in a graphic format is invisible to spiders. That means information critical to accurate search engine indexing may go unread. This can lead to a site that is only partially indexed, misclassified within the index or, worst-case scenario, banned from the search engine for perceived infractions. The only thing you’ll see coming through your site is tumbleweeds.

 

Structure your site skin for humans and your site code for spiders.

Functionally, that’s the difference between SEO (search engine optimization) and SEM (search engine marketing). But to achieve on-line commercial success, you need to know your buyers – their likes and dislikes – and you need to know spiders and what they look for as they crawl the code underlying your site.

 

Even though this may sound like a simple task, it isn’t always so. For example, web frames are useful to humans in speeding up interactivity, but frames can also corral spiders, trapping them and preventing them from reporting back to home base with your site data accurately in place.

 

In an ideal, digital world, spiders would be more intuitive, more refined in their search and indexing skills, and able to distinguish honest site owners who use the limitations of bots from the scammers who abuse these same limitations.

Saturday, January 24, 2009

Buy My Book. Get Out of That Work Rut - Like Now!

Five Negative Ranking Factors From the Googlistas



So what do the cyber-pros identify as the most negative ranking factors within Google’s current algorithm? They’re listed below, but take these Google negatives with a grain of salt.

 

It could all change tonight while you sleep.

 

Negative Ranking Factor #1: Googlebots can’t access your server.

If the site is down for more than 48 hours, which is often the case with low-rent web hosts located halfway around the world, a site’s Google ranking drops like a stone.

 

If your host server is down a lot, search engines don’t want to recommend the site to visitors, who will only see an error message telling them the site is unavailable and can’t be accessed.

 

The solution? Find a host that delivers not only 99.9% uptime but also local tech support, backup emergency generators and multiple layers of server-side security. You’ll spend about $7.00 a month for quality shared hosting. Double that amount for quality dedicated service if cross-server attacks are a concern. Don’t let a few bucks a month keep your site from higher rankings. It’s just not cost-effective.
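
If you’d rather trust your own numbers than the host’s marketing copy, a simple scheduled check will tell you how often your site really answers. Here’s a minimal sketch using Python’s standard library; the URL is a placeholder for your own home page:

```python
import urllib.request
from datetime import datetime

SITE = "http://www.example.com/"  # placeholder: your own home page


def check_once(url, timeout=10.0):
    """Return True if the page answers with a non-error status in time."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except Exception:
        return False


# Run this from cron (or any scheduler) and keep the log; the share of
# "up" lines over time is your real uptime figure.
print(datetime.now().isoformat(), "up" if check_once(SITE) else "DOWN")
```

A month of those log lines is a better basis for a hosting decision than any uptime badge on a sales page.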

 

Note: Server availability as a ranking factor is one of the most contended topics among SEO professionals who spend much of their time trying to out-think Googlebots, so even the experts can’t agree on this one.

 

Negative Ranking Factor #2: Duplicate or Similar Content.

Most experts do agree on this one.

 

Repetitious content is a stone-cold killer. Now, that doesn’t mean you can’t pick up a useful piece of syndicated content of interest to your readers. The warning here has to do with site text. A programmer can always upload a syndicated article. However, body text should change from page to page, providing a more useful visitor experience.

 

Of course, duplicate content can be tagged with a “do not index” designation, but too many of these “do not enter” signs are also a negative ranking factor. Bots want to be able to crawl pages, and when you keep them off of critical content pages, it’ll have a negative impact on your SERP ranking on Google.

 

Negative Ranking Factor #3: Links to low-quality sites.

SEO survey contributor Lucas Ng sums it up nicely: “Linking out to a low quality neighborhood flags you as a resident of the same neighborhood.”

 

It’s not just about links and plenty of them. It’s more about the quality of the links on a site. So, link up to sites in nice neighborhoods. On the web, Googlebots know you by the company you keep.

 

Negative Ranking Factor #4: Link Schemes and Link Selling.

Google’s algorithm employs probability modeling to identify bought-and-paid-for links, which doesn’t always equate to an accurate view of a site’s actual linking activity. Even so, Googlebots act on the assumptions programmed into the algorithm.

 

A site with a broad menu of links to diverse sites won’t fare well come spidering time. These link farms are easy for bots to spot. The key to avoiding being mis-indexed by Googlebots is to avoid too many links, link to higher-quality, more-visited sites and never buy or sell links. Ignoring that advice could mean another web site fatality.

 

Negative Ranking Factor #5: Duplicate Title/Meta Tags.

Search engine algorithms employ numerous filters to identify everything from questionable links to duplicate content that appears on numerous site pages. The same thing is true of a site’s HTML code. Too many duplicate title tags and duplicate meta data can hurt you.

 

Survey participant Aaron Wall stated, “If a site does not have much content and has excessive duplication, it not only suppresses rankings, but it may also get many pages thrown in the supplemental results.”

 

Bots read code, and if the same title tags show up on page after page, if title tags don’t match page text, or if meta data is cut and pasted into every site page, these crawlers take offense, according to some experts.
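
You can audit that yourself before a crawler does. Here’s a minimal sketch that pulls the <title> out of locally saved copies of your pages with Python’s standard html.parser and reports any duplicates; the file names are hypothetical:

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleGrabber(HTMLParser):
    """Collects the text inside a page's <title> tag."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


# Hypothetical local copies of your site's pages.
pages = ["index.html", "about.html", "products.html"]

titles = defaultdict(list)
for page in pages:
    grabber = TitleGrabber()
    with open(page, encoding="utf-8") as f:
        grabber.feed(f.read())
    titles[grabber.title.strip()].append(page)

for title, where in titles.items():
    if len(where) > 1:
        print(f"Duplicate title {title!r} on: {', '.join(where)}")
```

The same tally works for meta descriptions; anything cut and pasted site-wide will show up immediately.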

 

However, there’s another whole school of thought here. Many SEO pros and site designers believe just the opposite is true – that giving each page its own title tag creates numerous entry points to a site, and because each page is indexed separately, the site maintains a larger presence on SERPs.

 

The key appears to be the duplication itself – inserting repetitive title and meta tags. If the content doesn’t change on a particular page, that page doesn’t call for yet another title tag. However, when topics and functions do change from page to page within a site, title tags do help spiders identify each page’s purpose and do provide greater site access to potential visitors.

 

What NOT To Do With This Information

The wheels are spinning, aren’t they?

 

You and a million other site owners are weighing negative ranking factors and the impact these factors have on your SERPs position on Google.

 

Forget it. Let it go. The time you spend trying to reverse engineer your site to appeal to the perceptions of a collection of 31 SEO professionals would be better spent on search engine marketing – promoting to humans.

 

Oh, sure, you can migrate your site to a host with a much improved uptime and, in this case, you should regardless of what Googlebots like and dislike. You should migrate, not because bots will like you better, but because your customers will like you better when you’re there when they need you.

 

Same with cheesy links. Disconnect from garbage sites, link farms and any site that ranks lower than yours in PageRank (PR). That’ll take five minutes of your time, and it’s something you should do. Again, forget the bots; do it for the site visitors seeking to further their web searches through links on your site. Help out site visitors because it’s just good business.

 

But if you’ve got duplicate content on site, perhaps as RSS feeds, content syndication or hosted content, it seems counterproductive to remove this useful information from the site. Bots recognize these feeds and their time-saving value to visitors, who get good content all in one place, even if it does appear on a few other sites.

 

There are a couple of lessons to be learned here. Lesson #1: Even really smart people who study the activities of Googlebots under controlled conditions cannot ultimately agree on which negative ranking factors are programmed into that passing Googlebot.

 

Lesson #2: (And the most important lesson du jour) Don’t try to outwit a Googlebot. Don’t rebuild your site to mitigate negative ranking factors. Take the obvious steps by going with a reliable host, cutting links to unattractive sites and so on, but don’t spend time reverse engineering your site based on the opinions of SEO pros.

 

Spend your time promoting your site to humans. Do it ethically. And over time, your site will receive an improved rank on Google’s SERPs – guaranteed.

 

Guaranteed? You betcha. “Length of time a site has been up” is one of the positive ranking factors. The longer you remain hooked into the web, the higher your Google ranking.

 

It’s just a matter of time.