Movielens Database DRUPALIZED!

Scott Reynolds's picture

OK,

Finally finished it. There still happens to be a permission problem, though. Users other than uid=1 will not be able to view the recommended node or access the recommendations page. I'm pretty sure you can fix that in the admin panel, but I haven't done that yet. If anyone can shed some light on it, please do. Here is where you can find the stuff: http://scottreynolds.us/Downloads/movielenssql.tar.gz

The 100k MovieLens dataset is not done; it is in a different file format and so requires a different program.
Info from readme file:

These files contain 1,000,209 anonymous ratings of approximately 3,900 movies
made by 6,040 MovieLens users who joined MovieLens in 2000.

Please remember that when using this to test the module, it takes MANY MANY cron runs to fully install it. Please play with the settings in admin/settings/cre to adjust the number of votes and report any findings (PLEASE!).

Info on the java program

The program itself generates .sql files. There are 3 classes, and each generates its respective .sql file (users.sql, movies.sql, votes.sql). Testcases.txt is a selection of randomly selected votes that WERE NOT added to the database. These should be used to check the algorithm. To adjust the number of test cases you must adjust the % chance of a vote being selected. That decimal is located at line 56, column 74 in the votes.java file (I know it should be a command-line option... I just forgot to implement it). Please make sure you use the .dat files located in the root of my sandbox and not the ones in the movielens1mill folder. I had to alter movies.dat because the apostrophes in contractions like "You're" would mess up the SQL statement.
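For anyone who wants to rebuild something similar, here is a rough sketch of the two ideas described above: holding out a random share of votes as test cases, and doubling apostrophes so titles like "You're" do not break the generated SQL. This is not the original votes.java. The class name, method names, and the table and column names are all made up for illustration, and the "UserID::MovieID::Rating::Timestamp" layout is the one described in the MovieLens 1M README.

```java
import java.util.List;
import java.util.Random;

// Hypothetical sketch only; not the code from the sandbox.
final class VoteSplitSketch {

    /** Doubles single quotes so a value like "You're" is safe inside a SQL string literal. */
    static String escapeSql(String value) {
        return value.replace("'", "''");
    }

    /** Builds one INSERT for a movie title. Table and column names are assumptions. */
    static String movieInsert(int movieId, String title) {
        return "INSERT INTO movies (movie_id, title) VALUES ("
                + movieId + ", '" + escapeSql(title) + "');";
    }

    /** Turns a "UserID::MovieID::Rating::Timestamp" line into an INSERT (timestamp ignored). */
    static String voteInsert(String line) {
        String[] fields = line.split("::");
        int userId = Integer.parseInt(fields[0]);
        int movieId = Integer.parseInt(fields[1]);
        int rating = Integer.parseInt(fields[2]);
        return "INSERT INTO votes (user_id, movie_id, rating) VALUES ("
                + userId + ", " + movieId + ", " + rating + ");";
    }

    /**
     * Sends each rating line to heldOut with probability holdOutChance,
     * otherwise to inserted. The held-out lines play the role of Testcases.txt.
     */
    static void split(List<String> lines, double holdOutChance, Random rng,
                      List<String> inserted, List<String> heldOut) {
        for (String line : lines) {
            if (rng.nextDouble() < holdOutChance) {
                heldOut.add(line);
            } else {
                inserted.add(line);
            }
        }
    }
}
```

Passing holdOutChance in as a parameter means it could be read from the command line (for example with Double.parseDouble(args[0])) instead of being edited in the source. Note that the escaping here only handles single quotes; other special characters would need their own handling.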

The program was created using the NetBeans IDE, so there is an Ant script as well. In the dist folder there is a jar file.

Of course, please remember that votes.sql is HUGE! It will take a while to download. Reports of any performance issues and any suggestions are welcome.

Comments

Scott Reynolds's picture

deleted

UGG! sorry found my error.

Scott Reynolds's picture

UGG! Sorry, found my error. I was using a less-than sign followed by dashes (which is of course a comment in HTML...).

Forgot to mention!!!

Scott Reynolds's picture

Forgot to mention that all user accounts are named Dynamite$uid, and the password is dynamite.

feedback this week

schavester's picture

will start to play around with this data this week and give you a report.

good work.

richard

Hmm -- removed files from sandbox

Scott Reynolds's picture

I got yelled at :-D for putting this stuff in my sandbox. People new to contrib cvs would have to download all those files.

Moved them here: http://scottreynolds.us/Downloads/movielenssql.tar.gz

There is no Java program or .dat files there. Sorry. If you're interested in those, let me know and I will post them publicly.

Can't download

huymq85's picture

Sorry, I can't download the MovieLens database. Why? Help me, please.

I'm sure I removed them

Scott Reynolds's picture

I'm sure I removed them because it is a rather HUGE database.

Can you put it up on

christefano's picture

Can you put it up on RapidShare or some place so we can see your hard work?

Well, I found it. Well

Scott Reynolds's picture

Well, I found it. Well, sort of. Check out my sandbox: http://cvs.drupal.org/viewvc.py/drupal/contributions/sandbox/scottreynol...

Run that code (main.class) against this dataset: http://www.grouplens.org/node/73

The MovieLens database file cannot be used

zhaokai's picture

When I unzip the file, I get an error saying "The file is damaged". Could you send me the file by email? My email address is "zhaokai09@gmail.com". Thank you very much!
