Twitter Weekly Updates for 2009-06-28

  • Too bad that Rafa Nadal can’t play Wimbledon. I still remember last year’s Federer-Nadal match, very enjoyable #
  • Great group OnStartup in LinkedIn. Found a lot of material and info on team building for my #startup #
  • Enough social network activity; time to deploy the June version of Buscaplus #
  • Done with June version of Buscaplus. Next release for end of summer, the Summer Release #
  • @dexin What type of corporate application can’t run in a cloud like Amazon? in reply to dexin #
  • Yesterday I tested intensive Berkeley DB writes on a small Amazon #ec2 instance. Results were not great. Will test more tomorrow #
  • @abarrera Inkzee type of entertainment in reply to abarrera #
  • @oudiantebi Wish I had all the net talking when I am going back to work, right? in reply to oudiantebi #
  • @ilde Yes, it is a cool website in reply to ilde #
  • Testing Amazon images with a huge DB writing script: 1 million and 3 million rows. Will write a post later #
  • This type of heavy writing activity is routine in my search engine, as in others #
  • This is what I like most about Amazon: you can test what you need and conclude whether it is for you or not #
  • On my Amazon EC2 findings… “Using Amazon EC2 As Infrastructure For An Internet Search Engine” http://is.gd/1b06h #
  • @abarrera Take a look at my post on Amazon EC2, it seems promising as a backend #
  • @Rumford it is my favorite app on iphone #
  • eureka, I got a great idea for defining a map of link authority with an alternate method. should be online soon #
  • cool, Spain came back from limbo in soccer game in second half #
  • USA played great, very good defense. broken dreams for spain #
  • @lucy_reeding thanks for the iphone info in reply to lucy_reeding #
  • @abarrera Glad you liked it. Thanks for sharing in reply to abarrera #
  • Bearing in mind that Twitter users from Spain are only 0.63% and most users are in the US, I would define it as a perfect marketing and PR tool for the US #
  • Twittering to check what’s going on on the net about Michael Jackson #
  • good morning #
  • Starting the development of a new module for Buscaplus. Will allow defining a set of keywords to define link authoring relevance #
  • Playing MJ songs from iPod as tribute. Thanks for all those great moments, from the Thriller video to now #
  • Creating the ec2 image for june version and a backup, maintenance activities for the amazon universe #
  • What do you think Michael Jackson will be remembered for? 1- His Music, 2- His influence in music (videos, concerts), 3- His scandals #
  • Taking Michael Jackson quiz at CNN http://is.gd/1fXLm #
  • I am having fun taking the quiz #
  • don’t know this one… who directed the Thriller video? Spielberg, George Lucas, John Landis or Coppola #
  • got 8/15 right, guess is ok, was fun, recommend it #
  • @healingsoul why is green? in reply to healingsoul #
  • Listening to The Doors. They remind me of university times. Waiting until it is time to go out and meet a friend #
  • RT @henweb: wow, i see that too http://www.twitpic.com/8gz5m on the us site. oops, google! #
  • RT @dannysullivan posted, Google Thinks Michael Jackson Died At Age 65 In 2007, http://bit.ly/GZSZ5 #
  • i am in Irish rover with my dear friend carlos #

Powered by Twitter Tools.

No Comments

Using Amazon EC2 As Infrastructure For An Internet Search Engine

Today I tested different Amazon EC2 images with the kind of bulk writes routinely performed by my startup Buscaplus, an Internet search engine framework. I currently run a set of 4 servers with SATA disks and I am planning to move to Amazon.

We use Berkeley DB as the index database engine. It is pretty fast, especially if you size the memory cache correctly. Buscaplus needs to write huge amounts of data to disk, and the heavy database requirements of a search engine often cause bottlenecks. Disk write speed is therefore crucial if we ever move to Amazon. I have not yet designed a cloud deployment across many instances, but today’s tests make it clear that Amazon EC2 is an option for Buscaplus.

Tests

Berkeley DB stores data as key -> value pairs. You can select BTREE as well as other access methods. We use BTREE and a cache of 128MB for all tests, and we write 100 bytes for each row of data. The keys are simply a counter padded with zeros on the left, like ‘0000000345’.
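The write pattern described above can be sketched roughly as follows. This is a minimal illustration and not the script used for the tests: the helper names (`make_key`, `make_row`, `bulk_write`, `raw_size_bytes`) and the filler payload are assumptions, and a plain Python dict stands in for the database. For a real run you would pass a Berkeley DB BTREE handle from whichever binding you use, configured with the 128MB cache.

```python
import time

ROW_SIZE = 100   # bytes written per row, as in the tests
KEY_WIDTH = 10   # the example key '0000000345' is ten characters wide


def make_key(counter: int) -> bytes:
    """Return the counter zero-padded on the left to a fixed width."""
    return str(counter).zfill(KEY_WIDTH).encode("ascii")


def make_row(counter: int) -> bytes:
    """Build a 100-byte value; the filler content is an assumption."""
    return (b"%d:" % counter).ljust(ROW_SIZE, b"x")


def raw_size_bytes(n_rows: int) -> int:
    """Keys plus values only, ignoring any database overhead."""
    return n_rows * (KEY_WIDTH + ROW_SIZE)


def bulk_write(store, n_rows: int) -> float:
    """Write n_rows pairs into a mapping-like store; return elapsed seconds."""
    start = time.perf_counter()
    for i in range(n_rows):
        store[make_key(i)] = make_row(i)
    return time.perf_counter() - start


if __name__ == "__main__":
    store = {}  # in-memory stand-in, NOT Berkeley DB
    elapsed = bulk_write(store, 100_000)
    print(f"{len(store)} rows written in {elapsed:.2f} s")
    print(f"raw data for 1,000,000 rows: {raw_size_bytes(1_000_000)} bytes")
```

With these sizes, keys and values alone come to 110 bytes per row, so 20 million rows amount to 2.2 GB of raw data before any database overhead.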

Sample     | index-1 | ec2 small | ec2 large | ec2 ultra large | ec2 medium | ec2 high extra large
1.000.000  | 13.35   | 18.60     | 9.50      | 9.50            | 9.00       | 7.99
3.000.000  | 39.81   | 44.62     | 27.47     | 26.19           | 26.14      | 25.90
20.000.000 | Unstable

index-1 is one of the current servers. I would conclude that the “medium” instance is a great option: at only $0.20 / hour it performs well, better than the current infrastructure.
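To make the comparison with index-1 explicit, each figure can be divided by the index-1 figure for the same sample. The sketch below does only that. The numbers are copied from the results table, the unit is not stated in the post so only unit-free ratios are computed, and the function name `relative_to_baseline` is an assumption introduced for illustration.

```python
# Figures copied from the results table. The post does not state the unit,
# so only ratios against the index-1 server are computed.
RESULTS = {
    1_000_000: {
        "index-1": 13.35, "ec2 small": 18.60, "ec2 large": 9.50,
        "ec2 ultra large": 9.50, "ec2 medium": 9.00,
        "ec2 high extra large": 7.99,
    },
    3_000_000: {
        "index-1": 39.81, "ec2 small": 44.62, "ec2 large": 27.47,
        "ec2 ultra large": 26.19, "ec2 medium": 26.14,
        "ec2 high extra large": 25.90,
    },
}


def relative_to_baseline(row, baseline="index-1"):
    """Divide every figure in a row by the baseline server's figure."""
    base = row[baseline]
    return {name: value / base for name, value in row.items() if name != baseline}


if __name__ == "__main__":
    for sample, row in RESULTS.items():
        print(f"{sample:,} rows")
        ratios = relative_to_baseline(row)
        for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
            print(f"  {name:<22} {ratio:.2f} x the index-1 figure")
```

A ratio below 1.0 means a lower figure than index-1, which matches the conclusion above that lower is better here. The medium instance comes out at roughly 0.67 for 1 million rows and 0.66 for 3 million, while the small instance is above 1.0 in both samples.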

I also found that when dealing with a lot of data, small instances are of course a “no-no”, but so are larger instances with local disks. Under high I/O even big instances may do badly if the load at that time is high. This was not the case with EBS: with high I/O and EBS I got great results every time, so I would definitely go with EBS.

The 20 million row tests were unstable even with a $0.80 High CPU Extra Large instance. They produced a DB table of more than 3GB.


Twitter Weekly Updates for 2009-06-21


Twitter Weekly Updates for 2009-06-14


Twitter Weekly Updates for 2009-05-17

  • @abarrera My brother works at SAP. The funny thing is that he worked previously at Microsoft Iberia. Won’t be happy about merge. in reply to abarrera #
