Wednesday, June 13, 2007

Introducing Blogger... in draft


Blogger Buzz: Introducing Blogger... in draft

Today we released something for Blogger that we've long wanted to do - an experimental version of the site where the early adopters among you can try out new features before they're ready for full release. We're calling it Blogger in draft because the features are almost ready for publishing, but not quite.

The first feature available on Draft is Video Upload, accessible via a new button in the Post Editor. Check it out!


Expanding the webmaster central team


Official Google Webmaster Central Blog: Expanding the webmaster central team

You've probably already figured this out if you use webmaster tools, the webmaster help center, or our webmaster discussion forum, but the webmaster central team is a fantastic group of people. You have seen some of them helping out in the discussion forums, and you may have met a few more at conferences, but there are lots of others behind the scenes who you don't see, working on expanding webmaster tools, writing content, and generally doing all they can for you, the webmaster. Even the team members you don't see are paying close attention to your feedback: reading our discussion forum, as well as blogs and message boards. We introduced you to a few of the team before SES NY and Danny Sullivan told you about a few Googler alternatives before SES Chicago. We also have several interns working with us right now, including Marcel, who seems to have been the hit of the party at SMX Advanced.

I am truly pleased to welcome a new addition to the team, although she'll be a familiar face to many of you already. Susan Moskwa is joining Jonathan Simon as a webmaster trends analyst! She's already started posting on the forums and is doing lots of work behind the scenes. Jonathan does a wonderful job answering your questions and investigating issues that come up, and he and Susan will make a great team. Susan is a bit of a linguistic genius, so she'll also be helping out in some of the international forums, where Dublin Googlers have started reading and replying to your questions. Want to know more about Susan? You never know what you'll find when you do a Google search.


[G] Update to our event on 6/14


Official Google Checkout Blog: Update to our event on 6/14

eBay Live attendees have plenty of activities to keep them busy this week in Boston, and we did not want to detract from that activity. After speaking with officials at eBay, we at Google agreed that it was better for us not to feature this event during the eBay Live conference. Google is constantly reaching out to new users and sellers, and we are available to privately discuss any matters of concern with individuals as they relate to Google products. Interested parties may contact us at


Duplicate content summit at SMX Advanced


Official Google Webmaster Central Blog: Duplicate content summit at SMX Advanced

Last week, I participated in the duplicate content summit at SMX Advanced. I couldn't resist the opportunity to show how Buffy is applicable to the everyday search marketing world, but mostly I was there to get input from you on the duplicate content issues you face and to brainstorm how search engines can help.

A few months ago, Adam wrote a great post on dealing with duplicate content. The most important things to know about duplicate content are:
  • Google wants to serve up unique results and does a great job of picking a version of your content to show if your site includes duplication. If you don't want to worry about sorting through duplication on your site, you can let us worry about it instead.
  • Duplicate content doesn't cause your site to be penalized. If duplicate pages are detected, one version will be returned in the search results to ensure variety for searchers.
  • Duplicate content doesn't cause your site to be placed in the supplemental index. Duplication may indirectly influence this however, if links to your pages are split among the various versions, causing lower per-page PageRank.
At the summit at SMX Advanced, we asked what duplicate content issues were most worrisome. Those in the audience were concerned about scraper sites, syndication, and internal duplication. We discussed lots of potential solutions to these issues, and we'll definitely consider these options along with others as we continue to evolve our toolset. Here's a list of some of the potential solutions we discussed, so that those of you who couldn't attend can get in on the conversation.

Specifying the preferred version of a URL in the site's Sitemap file
One thing we discussed was the possibility of specifying the preferred version of a URL in a Sitemap file, with the suggestion that if we encountered multiple URLs that point to the same content, we could consolidate links to that page and could index the preferred version.
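To make the proposal concrete: no element for marking a preferred URL exists in the Sitemap protocol today, so purely as a hypothetical sketch, an extension along the lines discussed might have looked something like this (the `<preferred>` element is invented for illustration; the `urlset` namespace is the real Sitemap protocol one):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/widgets</loc>
    <!-- Hypothetical element: mark this URL as the canonical version,
         so links to duplicates like /widgets?ref=home consolidate here -->
    <preferred>true</preferred>
  </url>
</urlset>
```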

Providing a method for indicating parameters that should be stripped from a URL during indexing
We discussed providing this in either an interface such as webmaster tools or in the site's robots.txt file. For instance, if a URL contains session IDs, the webmaster could indicate the variable for the session ID, which would help search engines index the clean version of the URL and consolidate links to it. The audience leaned towards an addition to robots.txt for this.
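A robots.txt directive like this was never standardized, so the following is only a hypothetical sketch of what the audience's suggestion might look like (the `Strip-param` directive is invented for illustration):

```
User-agent: *
# Hypothetical directive: treat URLs as identical after removing
# the "sid" session-ID parameter, so /page?sid=abc123 and /page
# are indexed as one clean URL with consolidated links
Strip-param: sid
```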

Providing a way to authenticate ownership of content
This would provide search engines with extra information to help ensure we index the original version of an article, rather than a scraped or syndicated version. Note that we do a pretty good job of this now and not many people in the audience mentioned this to be a primary issue. However, the audience was interested in a way of authenticating content as an extra protection. Some suggested using the page with the earliest date, but creation dates aren't always reliable. Someone also suggested allowing site owners to register content, although that could raise issues as well, as non-savvy site owners wouldn't know to register content and someone else could take the content and register it instead. We currently rely on a number of factors such as the site's authority and the number of links to the page. If you syndicate content, we suggest that you ask the sites who are using your content to block their version with a robots.txt file as part of the syndication arrangement to help ensure your version is served in results.
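The robots.txt block suggested above is straightforward for syndication partners to set up. Assuming a partner hosts your articles under a /syndicated/ directory (a hypothetical path chosen for illustration), their robots.txt could read:

```
# Partner site's robots.txt: keep syndicated copies out of
# search engine indexes so the original version is served
User-agent: *
Disallow: /syndicated/
```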

Making a duplicate content report available for site owners
There was great support for the idea of a duplicate content report that would list pages within a site that search engines see as duplicate, as well as pages that are seen as duplicates of pages on other sites. In addition, we discussed the possibility of adding an alert system to this report so site owners could be notified via email or RSS of new duplication issues (particularly external duplication).

Working with blogging software and content management systems to address duplicate content issues
Some duplicate content issues within a site are due to how the software powering the site structures URLs. For instance, a blog may have the same content on the home page, a permalink page, a category page, and an archive page. We are definitely open to talking with software makers about the best way to provide easy solutions for content creators.

In addition to discussing potential solutions to duplicate content issues, the audience had a few questions.

Q: If I nofollow a substantial number of my internal links to reduce duplicate content issues, will this raise a red flag with the search engines?
The number of nofollow links on a site won't raise any red flags, but that is probably not the best method of blocking the search engines from crawling duplicate pages, as other sites may link to those pages. A better method may be to block pages you don't want crawled with a robots.txt file.
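As a concrete sketch of that advice, assuming your duplicate printer-friendly pages live under a /print/ path (a hypothetical layout for illustration), blocking them in robots.txt looks like:

```
User-agent: *
# Keep printer-friendly duplicates from being crawled at all
Disallow: /print/
```

Unlike nofollow on your own internal links, this also covers the case where other sites link to those duplicate pages.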

Q: Are the search engines continuing the Sitemaps alliance?
We launched in November of last year and have continued to meet regularly since then. In April, we added the ability for you to let us know about your Sitemap in your robots.txt file. We plan to continue to work together on initiatives such as this to make the lives of webmasters easier.
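The robots.txt Sitemap declaration mentioned above is a single line; using example.com as a stand-in domain:

```
# Auto-discovery: tells crawlers where the Sitemap lives
Sitemap: http://www.example.com/sitemap.xml
```

The line can appear anywhere in the robots.txt file and is recognized by the Sitemaps alliance search engines.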

Q: Many pages on my site primarily consist of graphs. Although the graphs are different on each page, how can I ensure that search engines don't see these pages as duplicate since they don't read images?
To ensure that search engines see these pages as unique, include unique text on each page (for instance, a different title, caption, and description for each graph) and include unique alt text for each image. (For instance, rather than use alt="graph", use something like alt="graph that shows Willow's evil trending over time".)
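Putting that advice into markup, a graph page might carry unique text like this (file names and wording are illustrative):

```html
<h1>Willow's evil over time</h1>
<p>This graph tracks Willow's evil from season four through season six.</p>
<!-- Descriptive alt text instead of a generic alt="graph" -->
<img src="willow-evil-trend.png"
     alt="graph that shows Willow's evil trending over time">
```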

Q: I've syndicated my content to many affiliates and now some of those sites are ranking for this content rather than my site. What can I do?
If you've freely distributed your content, you may need to enhance and expand the content on your site to make it unique.

Q: As a searcher, I want to see duplicates in search results. Can you add this as an option?
We've found that most searchers prefer not to have duplicate results. The audience member who asked commented that she may not want information from a particular site and would like other choices, but in that case, other sites will likely not have identical information and will therefore still show up in the results. Bear in mind that you can add the "&filter=0" parameter to the end of a Google web search URL to see additional results that might be similar.

I've brought all the issues and potential solutions that we discussed at the summit back to my team and others within Google, and we'll continue to work on providing the best search results and expanding our partnership with you, the webmaster. If you have additional thoughts, we'd love to hear them!


Tuesday, June 12, 2007

Two updates to Blogger clients


Blogger Buzz: Two updates to Blogger clients

By coincidence, a pair of stellar blogging clients have seen major updates in the past several days. If you like using Blogger but have browser-o'-phobia, one of these might be right for you:

The Windows Live Writer team has released Beta 2 of their software, which adds labels support along with the proverbial host of other features. Developer and friend of Blogger Joe Cheng covers the highlights and lowlights on his blog, and you can grab the beta for Windows Vista and XP over at the download page.

In the Macintosh corner, Red Sweater Software's MarsEdit has been updated to version 1.2, adding Blogger photo upload support via Picasa Web Albums. Developer Daniel Jalkut (also a friend of Blogger) describes the update on Red Sweater's blog. You can download a 30 day trial (Mac OS X 10.3.9 or higher) from the MarsEdit page.

For bonus additional Blogger goodness (for Mac users), grab the newly updated, newly working again Blogger Dashboard widget from Google's widget page. F12 + typing = blog post.

Working on a Blogger client of your own? Make sure you're hanging out in the Blogger Dev group to chat and keep in touch.


Custom Search on the fly


Google Custom Search: Custom Search on the fly

Starting today, there's a new feature that makes Custom Search Engines (CSEs) even easier to create and keep up to date.

You can now create a CSE by simply placing a small piece of tailored code on a page on your site. With that one piece of code, Google's search technology will automatically include in your new CSE all of the sites you have linked to from that page, creating a dynamic, powerful and tailored search experience really quickly. Moreover, your new CSE will update itself periodically to include any new links added to that page.

So, if you have a blog or a directory-like site and don't feel like listing all of the URLs you want to search across, you can leave the work to us. With this new feature we'll automatically generate and update your CSE for you. For example, try the query 'sculpture' on this CSE dynamically created from a page of links to kids museums or the query 'planning' on the search engine about Artificial Intelligence we created from the page of links at Berkeley.

Pretty cool, eh? We think so too. There are many powerful things you can do with this new feature, and in the near future we'll be talking about different possibilities. In the meantime, however, feel free to get your dynamic Custom Search Engine up and running. We'll be back in an instant.

Keep the feedback and great ideas coming!


More ways for you to give us input


Official Google Webmaster Central Blog: More ways for you to give us input

At Google, we are always working hard to provide searchers with the best possible results. We've found that our spam reporting form is a great way to get your input as we continue to improve our results. Some of you have asked for a way to report paid links as well.

Links are an important signal in our PageRank calculations, as they tend to indicate when someone has found a page useful. Links that are purchased are great for advertising and traffic purposes, but aren't useful for PageRank calculations. Buying or selling links to manipulate results and deceive search engines violates our guidelines.

Today, in response to your request, we're providing a paid links reporting form within Webmaster Tools. To use the form, simply log in and provide information on the sites buying and selling links for purposes of search engine manipulation. We'll review each report we get and use this feedback to improve our algorithms and our search results. In some cases we may also take individual action on sites.

If you are selling links for advertising purposes, there are many ways you can designate this, including:
  • Adding a rel="nofollow" attribute to the link's <a> tag
  • Redirecting the links to an intermediate page that is blocked from search engines with a robots.txt file
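Concretely, the two options above might look like this (paths and file names are illustrative). For nofollow, annotate the paid link directly; for the intermediate-page approach, point the link at a redirect page and block that page's directory in robots.txt:

```html
<!-- Option 1: nofollow on the paid link -->
<a href="http://advertiser.example.com/" rel="nofollow">Sponsor</a>

<!-- Option 2: route through an intermediate redirect page -->
<a href="/go/sponsor">Sponsor</a>
```

```
# robots.txt for option 2: keep the redirect pages uncrawled
User-agent: *
Disallow: /go/
```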
We value your input and look forward to continuing to improve our great partnership with you.


[G] Save 5% on gifts for dads and grads


Official Google Checkout Blog: Save 5% on gifts for dads and grads

Father's Day is less than a week away, and graduation season is in full swing. To help you find gifts for your dads and grads, and are offering 5% off Google Checkout orders until Sunday, June 17th (that's Father's Day). Just enter coupon code 'dad' after you click on the Google Checkout button.


You Asked For It, You Got It. New Features Added to Google Analytics



Since releasing the new Google Analytics, we've received repeated requests for specific features. We felt strongly that these features deserved to be incorporated into the product immediately so, as of today, everyone has access to the most requested improvements. We are also removing the beta tag from the new interface and we'd like to remind everyone that the previous version of Google Analytics will be removed on July 18th.

What improvements can you expect to see? Here are the most prominent changes:

Hourly Reporting
Many of you listed hourly reporting as the most important feature missing in the new interface. We've put it back. Several of the reports now have a "View by: Daily/Hourly" switch that allows you to select whether you want to see your data by day or by hour. Andy Beal, our friends at Yelp, and thousands of others can rejoice in their hourly window to the world.

Clickable URLs
Danny Sullivan and a legion of others will be thrilled to see that we've added the ability to click straight through to external pages from links referenced in reports. Just click on the icon next to any link listed in the Referring Sites, Top Content, Top Landing Pages, and Top Exit Pages reports.

Cross Segmentation by Network Location
Many of you expressed disappointment that you couldn't cross segment reports by Network Location. Now you can.

Increased number of data rows per page
The interface now allows you to view up to 500 rows of data on a single report page, increased from a maximum of 100 rows.

Bounce Rate increase/decrease
Our always observant resident analytics evangelist, Avinash Kaushik, pointed out that an increase in bounce rate (not desirable) was displayed in green and a decrease in bounce rate (a desirable result) was displayed in red. He was right, so we flipped the colors. Bounce rate increases are now displayed in red and bounce rate decreases are displayed in green. Ah, much better.

AdWords Integration
It's now much easier to link your AdWords account to your Google Analytics account, so if you haven't done it yet, now is the time.

If you look very carefully, you may also notice the following changes:

- Google Analytics now recognizes the following search engines:,, and
- Reports that are newly added to the dashboard now have their data linked to their more detailed versions.
- We've added Help resources to the Email Reports interface.
- Several countries have been added to the list menu in Step 2 of the Account Activation process. The list is now consistent with available country choices in AdWords.

Finally, our release notes are posted in our Help Center if you would like to keep up to date on future Google Analytics changes. We hope you enjoy the new features.


Monday, June 11, 2007

Today's outage



This afternoon we experienced a brief outage, during which about half our users seemed to lose their subscriptions. This can happen when one of the many complex systems that power Google Reader experiences a glitch. We work hard to avoid problems of any kind, but occasionally something like this happens. The good news is that no data was actually lost, it was just temporarily inaccessible. Google's systems store data redundantly to minimize the chance of anything becoming permanently lost.

We were able to identify, diagnose, and fix today's outage within an hour, which is the kind of response time that we strive for. We'll continue to give quick status updates on problems like this in the future, so users who have trusted us with their data can feel comfortable doing so.


Thwarting a large-scale phishing attack


Google Online Security Blog: Thwarting a large-scale phishing attack

In addition to targeting malware, we're interested in combating phishing, a social engineering attack where criminals attempt to lure unsuspecting web surfers into logging into a fake website that looks like a real website, such as eBay, E-gold or an online bank. Following a successful attack, phishers can steal money out of the victims' accounts or take their identities. To protect our users against phishing, we publish a blacklist of known phishing sites. This blacklist is the basis for the anti-phishing features in the latest versions of Firefox and Google Desktop. Although blacklists are necessarily a step behind as phishers move their phishing pages around, blacklists have proved to be reasonably effective.

Not all phishing attacks target sites with obvious financial value. Beginning in mid-March, we detected a five-fold increase in overall phishing page views. It turned out that the phishing pages generating 95% of the new phishing traffic targeted MySpace, the popular social networking site. While a MySpace account does not have any intrinsic monetary value, phishers had come up with ways to monetize this attack. We observed hijacked accounts being used to spread bulletin board spam for some advertising revenue. According to this interview with a phisher, phishers also logged in to the email accounts of the profile owners to harvest financial account information. In any case, phishing MySpace became profitable enough (more than phishing more traditional targets) that many of the active phishers began targeting it.

Interestingly, the attack vector for this new attack appeared to be MySpace itself, rather than the usual email spam. To observe the phishers' actions, we fed them the login information for a dummy MySpace account. We saw that when phishers compromised a MySpace account, they added links to their phishing page on the stolen profile, which would in turn result in additional users getting compromised. Using a quirk of the CSS supported in MySpace profiles, the phishers injected these links invisibly as see-through images covering compromised profiles. Clicking anywhere on an infected profile, including on links that appeared normal, redirected the user to a phishing page. Here's a sample of some CSS code injected into the "About Me" section of an affected profile:

<a style="text-decoration:none;position:absolute;top:1px;left:1px;" href=""><img
style="border-width:0px;width:1200px;height:650px;"></a>

In addition to contributing to the viral growth of the phishing attack, linking directly off of real MySpace content added to the appearance of legitimacy of these phishing pages. In fact, we received thousands of complaints from confused users along the lines of "Why won't it let any of my friends look at my pictures?" regarding our warnings on these phishing pages, suggesting that even an explicit warning was not enough to protect many users. The effectiveness of the attack and the increasing sophistication of the phishing pages, some of which were hosted on botnets and were near perfect duplications of MySpace's login page, meant that we needed to switch tactics to combat this new threat.

In late March, we reached out to MySpace to see what we could do to help. We provided lists of the top phishing sites and our anti-phishing blacklist to MySpace so that they could disable compromised accounts with links to those sites. Unfortunately, many of the blocked users did not remove the phishing links when they reactivated their accounts, so the attacks continued to spread. On April 19, MySpace updated their server software so that they could disable bad links in users' profiles without requiring any user action or altering any other profile content. Overnight, overall phishing traffic dropped by a factor of five back to the levels observed in early March. While MySpace phishing continues at much lower volumes, phishers are beginning to move on to new targets.

Things you can do to help end phishing and Internet fraud
  • Learn to recognize and avoid phishing. The Anti-Phishing Working Group has a good list of recommendations.

  • Update your software regularly and run an anti-virus program. If a cyber-criminal gains control of your computer through a virus or a software security flaw, he doesn't need to resort to phishing to steal your information.

  • Use different passwords on different sites and change them periodically. Phishers routinely try to log in to high-value targets, like online banking sites, with the passwords they steal for lower-value sites, like webmail and social networking services.


[G] Desktop Gadgets at Developer Day


Inside Google Desktop: Desktop Gadgets at Developer Day

On May 31st Google hosted Developer Day events all around the world. The Google Desktop team gave two presentations: one in Mountain View, and another in Tokyo. Mihai gave the Mountain View talk, and James gave the Tokyo talk. The links lead to the YouTube videos, and are great resources for learning about the full potential of the Google Desktop APIs. Here is Mihai's presentation:

Many Google Desktop team members staffed booths and showed off the potential of Google Desktop gadgets. Developers were amazed at how easy it is to do powerful things from gadgets. We would start with a blank desktop and then hit shift-shift to bring up a slew of different and interesting gadgets. Everyone loved this. In particular, there were two gadgets that really piqued user and developer interest, because these gadgets do complicated things with small amounts of easy-to-understand code. Here they are.

Touring Gadget

Have you ever wondered which of your favorite bands are coming to town? The Touring gadget, by Martin Mroz, makes finding out easy. You enter your location, and using a simple Google Desktop API and the music community website JamBase, the Touring gadget shows you which of your favorite bands are coming to your town soon.

Touring gathers your favorite bands by using the Google Desktop Query API. When Google Desktop indexes the user's files, it extracts metadata from music files and stores it. Touring queries for music files and pulls out the artist. It only takes a few lines of code to get this data.

Multiplayer Reversi

Playing a game with your friends around the world isn't hard if the game uses the gadget GoogleTalk API. Multiplayer Reversi, by Turhan Aydin, illustrates this point and received lots of "oohs" and "ahhs" when Mihai presented it in San Jose. You select your friend to play with, they confirm, and you start to play reversi. If this sounds difficult, don't worry, it isn't: look at the code snippets.

Just days after the event, excited developers are submitting gadgets. We hope you developers out there will think about using the power of the APIs to make new and interesting gadgets that look great and empower users. If you're looking for gadgets to use yourself, go here to find all Google Desktop gadgets.