Victor's Computing Space: March 2010

Saturday, March 27, 2010

PHP server (on Win7) finally working

Lots of Windows PHP users recommend WAMP for setting up PHP + MySQL + Apache on Windows. Windows is not really the preferred platform for open-source PHP, so the setup sometimes takes a while, unlike on Unix.

Today I wasted some time on this.

I had already installed PHP5 and MySQL earlier, and WAMP installs its own copy of everything into its own folder.

Beware! You should uninstall the previous PHP completely and cleanly first, to avoid trouble.

In my case, I uninstalled the old PHP5 from the Control Panel, then went ahead and installed WAMP. Everything worked just fine, except for one weird bug!

No matter how I configured the files in WAMP, PHP NEVER printed any error messages!

I searched around and still could not find a solution, until 10 minutes ago, when I thought: maybe the uninstallation did not finish completely? Then I took a look at the old PHP folder. It was still there! Besides, a php.ini was sitting in that old folder!

Oh Jesus Christ!!! xxxxxxxxxxxx

I deleted the whole old folder and restarted WAMP, and bingo! It works!

Sunday, March 21, 2010

some engineering ways to hack MD5 hash

Found some interesting websites that attempt to crack MD5 in practice:

5MB words:
http://www.md5decrypter.com/

7GB words:
http://www.md5decrypter.co.uk/

MD5 takes a string of any length and hashes it into a 128-bit value that serves as a "signature" for that string.

In principle, if we stored all of these 128-bit values as the index of a database, with the short password as the item value, the space complexity would be:
2^128 ≈ 3.4 × 10^38

which is far too large in practice. But if we build a hash table that uses the "md5_128bit_value" as the key and the original cleartext as the item value, storing only the passwords we actually want to cover, then bingo!

Patrick also mentioned that we could first sort these 128-bit keys and then do a binary search for the queried "md5_128bit_value". But covering every possible value still takes too much space ...... up to about 10^38 entries....
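Here is a minimal Python sketch of both lookup ideas above, the hash table and the sorted list with binary search. The tiny word list and all function names are made up for illustration; this is not how the websites above are actually implemented.

```python
import bisect
import hashlib


def md5_hex(text):
    """Return the 128-bit MD5 digest of text as 32 hex characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_hash_table(candidates):
    """Map md5 digest -> cleartext for every candidate password."""
    return {md5_hex(word): word for word in candidates}


def build_sorted_index(candidates):
    """Sorted (digest, cleartext) pairs for the binary-search variant."""
    return sorted((md5_hex(word), word) for word in candidates)


def binary_search_lookup(sorted_index, digest):
    """Find digest in the sorted index by binary search, or return None."""
    pos = bisect.bisect_left(sorted_index, (digest, ""))
    if pos < len(sorted_index) and sorted_index[pos][0] == digest:
        return sorted_index[pos][1]
    return None


# Hypothetical word list; a real site would store millions of entries.
words = ["password", "123456", "letmein", "qwerty"]
table = build_hash_table(words)
index = build_sorted_index(words)

query = md5_hex("letmein")
print(table.get(query))                    # hash-table lookup
print(binary_search_lookup(index, query))  # binary-search lookup
```

Either way the storage grows with the number of candidate passwords, not with 2^128, which is why a word list of a few GB is feasible.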

Hmmmm... A lot of forums use MD5 to hash their passwords, so it would be wise to test the MD5 value of your password on those MD5 cracking websites before you hand it over to your forum...... like

www.ucbbs.com

Wednesday, March 17, 2010

NETFLIX PRIZE 1M$ gone! Sep 2009.

Amazingly, this prize has finally been won! The winning team, BellKor's Pragmatic Chaos, includes Yehuda Koren of Yahoo! Research.

Machine learning pulls recommendations out of the chaos of Netflix's huge dataset!

http://www.netflixprize.com/community/viewtopic.php?id=1537

It is our great honor to announce the $1M Grand Prize winner of the Netflix Prize contest as team BellKor’s Pragmatic Chaos for their verified submission on July 26, 2009 at 18:18:28 UTC, achieving the winning RMSE of 0.8567 on the test subset. This represents a 10.06% improvement over Cinematch’s score on the test subset at the start of the contest. We congratulate the team of Bob Bell, Martin Chabbert, Michael Jahrer, Yehuda Koren, Martin Piotte, Andreas Töscher and Chris Volinsky for their superb work advancing and integrating many significant techniques to achieve this result.

The Prize was awarded in a ceremony in New York City on September 21st, 2009. We will post a video on this forum of the presentation the team delivered about their Prize algorithm. In accord with the Rules the winning team has prepared a system description consisting of three papers, which we both make public below.

Team BellKor’s Pragmatic Chaos edged out team The Ensemble with the winning submission coming just 24 minutes before the conclusion of the nearly three-year-long contest. Historically the Leaderboard has only reported team scores on the quiz subset. The Prize is awarded based on teams' test subset score. Now that the contest is closed we will be updating the Leaderboard to report team scores on both the test and quiz subsets.

To everyone who participated in the Netflix Prize: You've made this a truly remarkable contest and you've brought great innovation to the field. We applaud you for your contributions and we hope you've enjoyed the journey. The Netflix Prize contest is now closed.

We will soon be launching a new contest, Netflix Prize 2. Stay tuned for more details.

The winning team’s papers submitted to the judges can be found below. These papers build on, and require familiarity with, work published in the 2008 Progress Prize.

Y. Koren, "The BellKor Solution to the Netflix Grand Prize", (2009).

A. Töscher, M. Jahrer, R. Bell, "The BigChaos Solution to the Netflix Grand Prize", (2009).

M. Piotte, M. Chabbert, "The Pragmatic Theory solution to the Netflix Grand Prize", (2009).
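As a side note on the numbers in the announcement: RMSE is the square root of the mean squared difference between predicted and actual ratings, and the improvement is measured relative to Cinematch's score. The small Python sketch below shows the definition and backs out the baseline implied by the two quoted figures (0.8567 and 10.06%). The function names are my own, and the baseline is derived from the quote, not taken from Netflix.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def implied_baseline(winning_rmse, improvement):
    """Baseline RMSE such that winning_rmse is `improvement` better than it."""
    return winning_rmse / (1.0 - improvement)


winning = 0.8567      # from the announcement
improvement = 0.1006  # 10.06%, from the announcement
baseline = implied_baseline(winning, improvement)
print(round(baseline, 4))

# Toy ratings, made up purely to exercise rmse().
print(rmse([3.0, 4.0, 5.0], [3.0, 2.0, 5.0]))
```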

Tuesday, March 16, 2010

Google interview question: Drop 2 eggs from a 100-story building

Google interview question: given 2 eggs and a 100-story building, find the exact lowest level from which a dropped egg breaks (it also breaks from every level above).

Underlying fact: if a dropped egg does not break, you can pick it up and reuse it!

Ravi and I spent some time today discussing it and came up with different solutions.

1, binary search is optimal when you have lots of eggs, achieving log2(n) complexity, but it's not the best way under this condition: only 2 eggs.

2, linear scanning. Assume the 100-level building is divided into sections of length x, so we have floor(100/x) sections. First, drop the 1st egg from level x; if it does not break, go up x levels and drop it again. Once it breaks, move to the section just below that level, start from the bottom of that section, and go up one level at a time with the 2nd egg until it breaks.

The number of trials f(x) in the worst case is at most

f(x) = floor(100/x) + x;

it's easy to see that f(x) is minimized around x = 10, where f(x) = 20.

That's good enough, but it is not the optimal solution for this problem!
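To sanity-check the equal-section strategy, here is a small brute-force sketch in Python. It tries every possible answer and counts the drops. The function name is my own, and I assume the egg might even survive the top level, which the problem statement leaves open. The exact count comes out one lower than the rough formula, because the 2nd egg never needs more than x - 1 drops inside a section.

```python
def worst_case_equal_sections(x, floors=100):
    """Worst-case number of drops with 2 eggs and equal sections of length x."""
    worst = 0
    # t is the lowest level that breaks the egg; floors + 1 means it never breaks.
    for t in range(1, floors + 2):
        drops = 0
        safe = 0              # highest level known to be safe
        limit = floors + 1    # lowest level known (or assumed) to break
        for level in range(x, floors + 1, x):   # 1st egg: x, 2x, 3x, ...
            drops += 1
            if level >= t:    # 1st egg breaks here
                limit = level
                break
            safe = level
        for level in range(safe + 1, limit):    # 2nd egg: scan upward
            drops += 1
            if level >= t:    # 2nd egg breaks here, answer found
                break
        worst = max(worst, drops)
    return worst


for x in (5, 10, 20):
    print(x, worst_case_equal_sections(x))
```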

3, notice that the above solution can be seen as "double linear" scanning, which is what we will attack in this improved version:

Instead of equal-length sections, what if we make the sections unequal? Furthermore, how about decreasing the number of levels in each section as we go upward? Also notice that at the beginning we (almost always) need to start from the lowest level, so why not try to "skip" more in the bottom sections?

Denote by "outside" the number of trials used to identify the section, and by "inside" the number of trials used to identify the level within that section. We have a tradeoff to make here:

"outside" + "inside" == constant

meaning that when you spend more trials on "outside", you should not spend too many trials on "inside", otherwise you are not likely to improve.

Here we go!

Assume we have :

(x) + (x-1) + (x-2) + ... + (1) >= 100

where each ( ) is a section length, so that the sections cover all 100 levels.

Solve for the smallest x with:
sum_{i=1}^{x} i = x(x+1)/2 >= 100, i.e. x >= (sqrt(801) - 1)/2. We could use Google Calculator to compute:

(sqrt(801) - 1)/2 ≈ 13.651

so the bottom section length is 14, and the respective section lengths going upward are 13, 12, 11, ...., 1.

Bingo! See the magic here?!

So the strategy follows the same fashion: first find the "outside" section by dropping the 1st egg until it breaks, then dive into the section just below that level and go linearly upward, dropping the 2nd egg...

e.g. when the 1st egg breaks at the 14th level, we have spent 1 trial to decide the "outside" section, then we start dropping the 2nd egg from the 1st level. The worst case here is when the answer is the 13th or 14th level: the 2nd egg needs all 13 trials, so we use up 13+1 = 14 trials.

This is actually the upper bound for our formulation! Remember that tradeoff? Each section higher up costs one more "outside" trial but is one level shorter, so the total never exceeds 14.
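The decreasing-section strategy can be checked the same brute-force way. The Python sketch below builds the 1st-egg levels (14, 27, 39, ...) and counts the worst case over every possible answer. All names are my own, and as an assumption the egg is allowed to survive even the top level.

```python
def smallest_first_section(floors=100):
    """Smallest x with x + (x-1) + ... + 1 >= floors."""
    x = 0
    while x * (x + 1) // 2 < floors:
        x += 1
    return x


def first_egg_levels(x, floors=100):
    """Levels for the 1st egg: x, x+(x-1), x+(x-1)+(x-2), ..., capped at the top."""
    levels, level, step = [], 0, x
    while level < floors and step > 0:
        level = min(level + step, floors)
        levels.append(level)
        step -= 1
    return levels


def worst_case(levels, floors=100):
    """Worst-case drops when the 1st egg follows `levels` and the 2nd scans upward."""
    worst = 0
    # t is the lowest breaking level; floors + 1 means the egg never breaks.
    for t in range(1, floors + 2):
        drops, safe, limit = 0, 0, floors + 1
        for level in levels:
            drops += 1
            if level >= t:    # 1st egg breaks here
                limit = level
                break
            safe = level
        for level in range(safe + 1, limit):
            drops += 1
            if level >= t:    # 2nd egg breaks here
                break
        worst = max(worst, drops)
    return worst


x = smallest_first_section()
levels = first_egg_levels(x)
print(x, levels, worst_case(levels))
```

Starting with a first section of 13 instead leaves the top of the building uncovered, so its worst case is worse than 14.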

Therefore we've solved the "egg salvation" Google brainteaser!
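To double-check that 14 really is the best possible, and not just the best for this family of strategies, here is a short sketch of the standard recurrence for this puzzle (my own addition, not part of the discussion with Ravi). With d drops and e eggs, one drop splits the problem into "egg broke" (d-1 drops, e-1 eggs, levels below) and "egg survived" (d-1 drops, e eggs, levels above).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def max_levels(drops, eggs):
    """Most levels that can always be resolved with this many drops and eggs."""
    if drops == 0 or eggs == 0:
        return 0
    # Levels below (egg broke) + the level we dropped from + levels above (survived).
    return max_levels(drops - 1, eggs - 1) + 1 + max_levels(drops - 1, eggs)


def min_drops(levels, eggs):
    """Fewest drops that guarantee an answer for a building with `levels` levels."""
    drops = 0
    while max_levels(drops, eggs) < levels:
        drops += 1
    return drops


print(min_drops(100, 2))   # the 2-egg puzzle
print(min_drops(100, 10))  # plenty of eggs: binary search territory
```

With 2 eggs, max_levels(d, 2) works out to d(d+1)/2, which is the same sum as in the section lengths above.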

Thanks for the show! :D

Saturday, March 13, 2010

play piano notes in Matlab

Here is some Matlab code, with a small helper function, to play a piano note scale (DO, RE, MI, ....), based on Wikipedia's pitch definition.

Quite interesting!

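% Play an 8-note major scale; n is the number of semitones above A4 (440 Hz).
for n = [1,3,5,6,8,10,12,13], sine_tone(440*2^(n/12)); end

% Save this function as sine_tone.m: it plays a 1-second sine tone at freq Hz.
function sine_tone(freq)
fs = 8192;              % sampling rate in Hz
t = 0:1/fs:1;           % sample times covering 1 second
y = sin(2*pi*freq*t);   % sine wave at the requested frequency
soundsc(y, fs)          % scale to full range and play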
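For anyone without Matlab, the same pitch formula can be sketched in plain Python. This only computes the frequencies and the sine samples and does not play any sound. The function names are my own, and 440 Hz for A4 is the reference pitch used in the code above.

```python
import math

A4_HZ = 440.0  # reference pitch, as in the Matlab code


def note_frequency(semitones_above_a4):
    """Equal-tempered pitch: each semitone multiplies the frequency by 2^(1/12)."""
    return A4_HZ * 2 ** (semitones_above_a4 / 12)


def sine_samples(freq, fs=8192, seconds=1.0):
    """Samples of sin(2*pi*freq*t) for t = 0, 1/fs, ..., seconds (both ends included)."""
    count = int(fs * seconds) + 1
    return [math.sin(2 * math.pi * freq * i / fs) for i in range(count)]


scale_steps = [1, 3, 5, 6, 8, 10, 12, 13]  # same steps as the Matlab loop
for n in scale_steps:
    print(n, round(note_frequency(n), 2))
```

Twelve semitones up is exactly one octave, so the frequency doubles.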

Monday, March 1, 2010

Torsten Reil: Animating neurobiologist

http://www.ted.com/speakers/torsten_reil.html

From modeling the mayhem of equine combat in Lord of the Rings: Return of the King to animating Liberty City gun battles in Grand Theft Auto IV, Torsten Reil's achievements are all over the map these days. Software that he helped create (with NaturalMotion, the imaging company he co-founded) has revolutionized computer animation of human and animal avatars, giving rise to some of the most breathtakingly real sequences in the virtual world of video games and movies- and along the way given valuable insight into the way human beings move their bodies.

Reil was a neural researcher working on his Masters at Oxford, developing computer simulations of nervous systems based on genetic algorithms- programs that actually used natural selection to evolve their own means of locomotion. It didn't take long until he realized the commercial potential of these lifelike characters. In 2001 he capitalized on this lucrative adjunct to his research, and cofounded NaturalMotion. Since then the company has produced motion simulation programs like Euphoria and Morpheme, state of the art packages designed to drastically cut the time and expense of game development, and create animated worlds as real as the one outside your front door. Animation and special effects created with Endorphin (NaturalMotion's first animation toolkit) have lent explosive action to films such as Troy and Poseidon, and NaturalMotion's software is also being used by LucasArts in video games such as the hotly anticipated Indiana Jones.

But there are serious applications aside from the big screen and the XBox console: NaturalMotion has also worked under a grant from the British government to study the motion of a cerebral palsy patient, in hopes of finding therapies and surgeries that dovetail with the way her nervous system is functioning.

"It might be surprising to find a biologist pushing the frontiers of computer animation. But Torsten Reil is bringing cheaper, lifelike digital characters to video games and films."

Technology Review