Hi,
I am doing some conceptual work on building a photo site. The idea is to grant all users access to all images up to 3000px (with a daily traffic limit, because I don't want users to exploit it and batch-download all images). Resolutions larger than 3000px are for premium members only.
I've talked to some people, and all of them said that using private files would be a performance killer for this idea. I was told to use public files and somehow make permissions work on top of that (for example by building my own file API or custom CCK fields). The only way I know to check file permissions is hook_file_download(), and that is only invoked when using private files.
As far as I know I need the private download method for this, but it will increase the load heavily. If I display a page with 90 thumbnails (selectable between 30/60/90), Drupal will bootstrap 90 times and slow down badly.
If I don't use private files, files are served directly over HTTP, and any user who knows the URL can access the image.
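The access rule described above can be written down as a small decision function. This is only a sketch in Python to make the rule concrete, not Drupal code: the `User` class, the function name, and the 200 MB daily quota are hypothetical, and it assumes that premium members are exempt from the daily limit (the post does not say either way).

```python
from dataclasses import dataclass

MAX_FREE_PX = 3000                   # from the post: free access up to 3000px
DAILY_LIMIT_BYTES = 200 * 1024 ** 2  # assumption: 200 MB per day, not from the post


@dataclass
class User:
    """Hypothetical stand-in for a site account."""
    is_premium: bool
    bytes_served_today: int = 0


def may_download(user: User, longest_side_px: int, file_size: int) -> bool:
    """Return True if this user may fetch an image of the given size right now."""
    if longest_side_px > MAX_FREE_PX:
        # Large resolutions are reserved for premium members.
        return user.is_premium
    if user.is_premium:
        # Assumption: premium members are not subject to the daily quota.
        return True
    # Free users may download until the daily traffic quota is used up.
    return user.bytes_served_today + file_size <= DAILY_LIMIT_BYTES
```

Whatever serves the files would have to call a check like this on every request for a protected image, which is exactly why the choice between public and private delivery matters for performance.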
Server:
- Root server on Debian
- Not sure about the hardware yet, but for a start I thought about:
i7-920, 8 GB DDR3 RAM , 2 x 750 GB SATA-II HDD (Software-RAID 1)
Categorization:
- core taxonomy
Optimisations:
- use Solr for search
- use mod_xsendfile to modify the headers and decrease the load a bit.
- use boost/memcache/cacherouter/apc
What do you think about the Pressflow Drupal distro?
I am willing to spend a lot of time on this and/or on coding some extra stuff. I only need some kind of assurance that this is reasonably possible with Drupal and that I am going in the right direction. I've read a lot about optimizations where people hacked core to achieve some tuning. Is that really necessary? I'm not a big fan of hacking core.
If this is not descriptive enough, please tell me and I will rewrite my description.
P.S. This is not a project for some company, it's a private one-man thing :)
Thank you very much for your help/hints/tips,
Marco

Comments
You could also physical
You could also physically separate light and heavy images (light images being the freely available ones).
Then in your .htaccess you write a rewrite rule that bypasses the private "system" path and gives direct access to the images.
Something like:
RewriteCond %{REQUEST_URI} ^/system/files/lightimages/
RewriteRule ^system/files/(.*)$ /files/$1 [L]
and no core hack ...
good idea. but the light
Good idea, but the light (free) images are still up to 3000px, so if users know the file system layout they could download as many as they want, or even batch-download them with a script.
You should also look into
You mentioned using X-Send. There's a module for it: http://drupal.org/project/xsend. You'd still do bootstraps to check access rights, but the image will then be sent by the web server instead of through php so it should help quite a bit. With some changes, the module can also be used by nginx. http://groups.drupal.org/node/36892
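To illustrate the hand-off described above: the application only checks access and answers with an `X-Sendfile` header, and the web server (Apache with mod_xsendfile) then streams the file itself. The following is a minimal sketch as a Python WSGI app, not the Drupal module's code. `is_allowed` is a hypothetical placeholder for the real permission check, and the private root path is borrowed from the directory layout discussed later in this thread.

```python
import posixpath

PRIVATE_ROOT = "/home/www.mysite.com/private"  # assumption: files live outside the web root
URL_PREFIX = "/system/files/"


def is_allowed(environ, rel_path):
    """Hypothetical access check; stands in for what hook_file_download() would decide."""
    return environ.get("REMOTE_USER") is not None


def app(environ, start_response):
    url_path = environ.get("PATH_INFO", "")
    if not url_path.startswith(URL_PREFIX):
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]

    # Normalize the requested path and refuse anything that escapes the private root.
    rel_path = posixpath.normpath(url_path[len(URL_PREFIX):])
    escapes = rel_path in (".", "..") or rel_path.startswith(("../", "/"))
    if escapes or not is_allowed(environ, rel_path):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Forbidden"]

    # Empty body: the web server replaces it with the file named in X-Sendfile.
    start_response("200 OK", [("X-Sendfile", posixpath.join(PRIVATE_ROOT, rel_path))])
    return [b""]
```

The application is still invoked once per file, so the bootstrap cost remains, but PHP (or here Python) no longer has to read and stream the image data itself.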
And if you're fitting 90
And if you're fitting 90 thumbnails on a page then you probably want to just go public download on thumbnails. I think that using private and public downloads at the same time may be included in D7, but you'll have to work with contrib modules or custom code to make it happen in D6.
mixing both, private/public
Mixing both private and public would do it. What I wonder is how to do the access check without hook_file_download() (which is only called in private mode).
If this is just a personal
If this is just a personal thing, I think you might be a little overly worried about the performance. Where are you going to put the server? I think some cheap shared hosting might work for a while, and then if this really takes off you could put the thousands in for a dedicated server. Just my thoughts,
Thomas Hansen
www.ThomasHansen.me
with personal i mean that i
By personal I mean that I am building this alone. If I put a lot of free 3000px images on it, I need a lot of space, and no shared hosting I know of really has enough. You also need root access to install mod_xsendfile, modify php.ini and other things, so I am going to get a dedicated server.
I've set up my test server (Debian) here on my LAN.
I see, I've got 50 GB on a
I see. I've got 50 GB on GoDaddy shared hosting for about $5 per month, and they say I can go as large as 150 GB. Yeah, you won't be able to do nearly as many customizations on a shared host, but for a basic photo site, I think this still sounds like serious overkill. Also, most shared hosts allow you to make a few changes to a local php.ini. Where are you going to put the server? What kind of uplink speed is it going to get?
Thomas Hansen
www.ThomasHansen.me
i rent the server and the
I rent the server, and the uplink is 100 Mbit; once traffic exceeds 2 TB it is throttled down to 10 Mbit.
i have UNLIMITED DISKSPACE
I have UNLIMITED DISK SPACE and BANDWIDTH on shared hosting...
Unlimited?
You might want to read the fine print in the hosting agreement. In my experience, there are a number of back door limits (e.g. inodes, type of files, type of sites, etc.) that effectively make "UNLIMITED" nothing more than a marketing gimmick. Note also that the memory limits are typically set low on shared hosting and that many (most?) Drupal sites will hit memory limitations before approaching the other limits.
i was joking :)
i was joking :)
Parallel Module
http://drupal.org/project/parallel will help with faster downloads if using a private filesystem. At least that's the word on the street.
What do you mean with "word
What do you mean by "word on the street"? You made that module :)
I don't use private download ever
None of the sites I run use private downloads, so I cannot personally verify that it helps, but I have had multiple reports of it working wonders with private downloads.
Idea
I finally thought about it and had a module idea: http://drupal.org/node/819742
What do you think of this concept?
I thought about mkalbere's
I thought about mkalbere's suggestion at #2 and how you'd do this in nginx.
File system would be
home
--- www.mysite.com
------ drupal
------ private
and drupal would sit at /home/www.mysite.com/drupal and files would be private and set to /home/www.mysite.com/private.
You'd upload original photos to private/pics using a cck filefield or whatever. Then you'd have imagecache presets for light and thumbnail. Originals would be served at http://www.mysite.com/system/files/pics/0001.jpg and light would be at http://www.mysite.com/system/files/imagecache/light/pics/0001.jpg and thumbnails would be at http://www.mysite.com/system/files/imagecache/thumbnail/pics/0001.jpg.
Then you'd create an nginx config file location like this:
location ^~ /system/files/imagecache/thumbnail/pics/ {
    alias /home/www.mysite.com/private/imagecache/thumbnail/pics/;
    access_log off;
    expires 45d;
    error_page 404 @drupal;
}
and that would work fine... IF the cached file already exists. If the thumbnail doesn't exist yet, then an anonymous user is going to get a 403. A registered user would go to the same location and imagecache would generate the file. So you'd have to prewarm the imagecache with each new photo before it'd appear for anon users.
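The prewarming step mentioned above could be scripted by requesting every derivative once as a logged-in user. Below is a rough Python sketch under several assumptions: the URL layout is the one from the example above, the preset names are "light" and "thumbnail", and `session_cookie` is a hypothetical cookie string from an authenticated session that has to be obtained separately.

```python
import urllib.request

BASE_URL = "http://www.mysite.com/system/files/imagecache"  # layout from the example above


def prewarm_urls(presets, originals):
    """Build one derivative URL per (original, preset) pair."""
    return [f"{BASE_URL}/{preset}/{path}" for path in originals for preset in presets]


def prewarm(urls, session_cookie):
    """Request each URL as a logged-in user so imagecache generates the file.

    Returns the list of URLs that could not be fetched.
    """
    failed = []
    for url in urls:
        request = urllib.request.Request(url, headers={"Cookie": session_cookie})
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                response.read()
        except OSError:  # URLError and HTTPError are subclasses of OSError
            failed.append(url)
    return failed
```

A call such as `prewarm(prewarm_urls(["light", "thumbnail"], ["pics/0001.jpg"]), cookie)` after each upload would then make sure the cached files exist before anonymous users hit the nginx location.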
thank you brian. The nginx
thank you brian.
The nginx config seems to be similar to a rewrite. I recently discovered: http://drupal.org/node/796384
You can access an imagecache preset through non-clean URLs like:
http://localhost/dev/imagecache/sites/default/files/imagecache/test/img.jpg
http://localhost/dev/imagecache/?q=sites/default/files/imagecache/test/i...
the image is served!! (WOW!!)
So my idea to use a rewrite is torn apart :)
I'm thinking about writing my own CCK field that saves files outside the web root, for example in var/www, and modifying/remodeling imagecache so that you can choose which presets to save as public or private and which presets to protect. This will be hard but seems necessary.
my thoughts on the issue
Best idea so far: force all protected files to end in *-private.jpg & regex filter on that
http://drupal.org/node/819742#comment-3066522
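The naming convention boils down to a single pattern match on the requested path. A minimal sketch in Python, assuming the suffix is exactly `-private.jpg` as written above (the function name is hypothetical; in a real setup the same pattern would live in the web server's rewrite or location rules):

```python
import re

# Matches any path whose file name ends in "-private.jpg".
PRIVATE_RE = re.compile(r"-private\.jpg$")


def needs_access_check(path: str) -> bool:
    """Return True if the file must go through the permission check."""
    return PRIVATE_RE.search(path) is not None
```

Everything that does not match can be served directly by the web server, so only protected files pay the cost of a Drupal bootstrap.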
i read you post/snipped. I
I read your post/snippet. I would need to hack imagecache to add that -private suffix to some presets.
Weird bug. I'm gonna have to
Weird bug. I'm gonna have to give that a try.
Setting up the files/
Setting up the files/ directory outside the web root takes care of that first scenario. The second scenario with files/ outside the web root is definitely a security hole in imagecache, but hopefully they'll have the hole patched soon and then the rewrite method should work.
This would be the schema. But
This would be the schema. But I'm sure that if I do it like that, not a single contributed module that uses files will work anymore.
http://www.screencast.com/users/e-anima/folders/Jing/media/281dd2cd-15d5...