robsite

In imdb

rmdb

IMDb enthält hunderttausende Filme, Serien, Schauspieler, Bewertungen und andere Film-Infos. Da wäre es schön, wenn man diese Daten irgendwie analysieren könnte. Leider gibt's keine bezahlbare API, wohl aber einen automatischen Export der wichtigsten Filmdaten, als Textdateien.

rmdb ist mein kleines PHP-Script, das diese Export-Dateien runterlädt und in eine relationale Datenbank importiert, damit man die Filmdaten per SQL durchsuchen kann.

Damit kann man einige interessante Infos zu Tage fördern:

Verteilung von Genres über die Jahre

SELECT start_year,
       COUNT(*) AS count
FROM   titles
       LEFT JOIN titles_genres AS tg1
              ON tg1.title_id = titles.id
WHERE  tg1.genre_id = "drama" -- repeat for a few genres

       -- Exclude tv episodes etc.
       AND title_type = "movie"

       AND start_year < YEAR(CURRENT_DATE())

GROUP  BY start_year
ORDER  BY start_year ASC; 
Verteilung von Horror, Western und Sci-Fi Filmen über die Jahre

Die 10 produktivsten Regisseure von guten Filmen in den 60ern

SELECT n.primary_name AS name,
       COUNT(*)       AS movies
FROM   principals
       LEFT JOIN names AS n
              ON n.id = name_id
       LEFT JOIN titles AS t
              ON t.id = title_id
WHERE  category_id = "director"
       AND t.title_type = "movie"
       AND t.start_year >= 1960
       AND t.start_year <= 1969
       AND t.runtime_minutes >= 90
       AND t.average_rating > 6
       AND t.num_votes > 10000
GROUP  BY name_id
ORDER  BY movies DESC
LIMIT  10; 
name movies
Jean-Luc Godard 7
Blake Edwards 6
Akira Kurosawa 5
Michelangelo Antonioni 5
Alfred Hitchcock 5
John Frankenheimer 5
Stanley Kramer 4
Robert Aldrich 4
Sergio Leone 4
Roman Polanski 4

Die 50 am besten bewerteten Horror-Komödien

SELECT CONCAT("[", primary_title, "](https://www.imdb.com/title/", id, ")") AS primary_title,
       start_year,
       average_rating,
       num_votes
FROM   titles
       LEFT JOIN titles_genres AS tg1
              ON tg1.title_id = titles.id
       LEFT JOIN titles_genres AS tg2
              ON tg2.title_id = titles.id
WHERE  tg1.genre_id = "horror"
       AND tg2.genre_id = "comedy"

       -- Exclude TV episodes etc.
       AND title_type = "movie"
       
       -- Exclude little voted on movies where average_rating is often too high. 
       -- Higher num_votes = more popular
       AND num_votes > 20000

ORDER  BY average_rating DESC,
          num_votes DESC
LIMIT  50; 
primary_title start_year average_rating num_votes
Shaun of the Dead 2004 7.9 547079
Evil Dead II 1987 7.7 161069
Zombieland 2009 7.6 562503
What We Do in the Shadows 2014 7.6 178609
One Cut of the Dead 2017 7.6 23294
Tucker and Dale vs Evil 2010 7.5 176660
An American Werewolf in London 1981 7.5 104122
Dead Alive 1992 7.5 96468
Stree 2018 7.5 33219
Army of Darkness 1992 7.4 174895
The Rocky Horror Picture Show 1975 7.4 148043
Bhool Bhulaiyaa 2007 7.4 26087
Gremlins 1984 7.3 219049
The Return of the Living Dead 1985 7.3 60200
House 1977 7.3 26640
The Lost Boys 1987 7.2 138911
Re-Animator 1985 7.2 63177
Bhoot Police 2021 7.2 26129
Tremors 1990 7.1 135475
The Frighteners 1996 7.1 87546
Little Shop of Horrors 1986 7.1 74525
The Fearless Vampire Killers 1967 7.1 31521
Dellamorte Dellamore 1994 7.1 21375
Bubba Ho-Tep 2002 6.9 48378
Warm Bodies 2013 6.8 229755
Ready or Not 2019 6.8 141645
Odd Thomas 2013 6.8 52720
Creepshow 1982 6.8 46916
Zombieland: Double Tap 2019 6.7 173135
Trick 'r Treat 2007 6.7 90509
Fresh 2022 6.7 38519
Housebound 2014 6.7 34328
Fido 2006 6.7 29187
Behind the Mask: The Rise of Leslie Vernon 2006 6.7 23800
Night of the Creeps 1986 6.7 23355
Happy Death Day 2017 6.6 136327
Death Becomes Her 1992 6.6 115999
May 2002 6.6 36838
Slither 2006 6.5 81269
Arachnophobia 1990 6.5 69144
The Witches of Eastwick 1987 6.5 69079
Bad Taste 1987 6.5 46735
Better Watch Out 2016 6.5 37711
Fright Night 2011 6.4 105215
Gremlins 2: The New Batch 1990 6.4 102085
Severance 2006 6.4 38791
The People Under the Stairs 1991 6.4 34750
Mayhem 2017 6.4 20928
The Babysitter 2017 6.3 89374
Dead Snow 2009 6.3 66839

Beliebte SciFi-Filme der 70er

Ohne Star Wars, da es nur Action, Adventure und Fantasy Genres hat.

SELECT CONCAT("[", primary_title, "](https://www.imdb.com/title/", id, ")") AS
       primary_title,
       start_year,
       average_rating,
       num_votes
FROM   titles
       LEFT JOIN titles_genres AS tg1
              ON tg1.title_id = titles.id
WHERE  tg1.genre_id = "sci-fi"

       -- Exclude TV episodes etc.
       AND title_type = "movie"

       -- Exclude little voted on movies where average_rating is often too high. 
       -- Higher num_votes = more popular
       AND num_votes > 10000

       AND start_year >= 1970
       AND start_year <= 1979

ORDER  BY average_rating DESC,
          num_votes DESC; 
primary_title start_year average_rating num_votes
Alien 1979 8.5 855270
A Clockwork Orange 1971 8.3 811076
Ivan Vasilyevich Changes His Profession 1973 8.2 16458
Stalker 1979 8.1 131654
Solaris 1972 8 89827
Fantastic Planet 1973 7.7 30480
Close Encounters of the Third Kind 1977 7.6 198472
Superman 1978 7.4 172371
Invasion of the Body Snatchers 1978 7.4 59190
The Andromeda Strain 1971 7.2 36803
Sleeper 1973 7.1 42854
Time After Time 1979 7.1 18387
Soylent Green 1973 7 64000
The Boys from Brazil 1978 7 27798
Westworld 1973 6.9 57182
The Stepford Wives 1975 6.9 17631
Mad Max 1979 6.8 204229
Logan's Run 1976 6.8 56057
The Brood 1979 6.8 29614
Slaughterhouse-Five 1972 6.8 13000
THX 1138 1971 6.7 51240
Phantasm 1979 6.6 36464
Silent Running 1972 6.6 28965
The Man Who Fell to Earth 1976 6.6 26180
Rollerball 1975 6.6 24752
Horror Express 1972 6.5 10758
Star Trek: The Motion Picture 1979 6.4 89032
The Omega Man 1971 6.4 31325
Shivers 1975 6.4 20837
A Boy and His Dog 1975 6.4 17756
Escape from the Planet of the Apes 1971 6.3 35051
Rabid 1977 6.3 18535
The Fury 1978 6.3 15211
Moonraker 1979 6.2 99575
Death Race 2000 1975 6.2 27669
Dark Star 1974 6.2 24315
Conquest of the Planet of the Apes 1972 6.1 32523
The Crazies 1973 6.1 13134
Beneath the Planet of the Apes 1970 6 46579
The Black Hole 1979 5.9 25455
Piranha 1978 5.9 21507
Zardoz 1974 5.8 22467
Futureworld 1976 5.7 10713
Battle for the Planet of the Apes 1973 5.4 30854

Die 10 schlechtesten Regisseure, die trotzdem noch Filme drehen

SELECT CONCAT("[", d_name, "](https://www.imdb.com/name/", name_id, ")") AS name,
       ROUND(avg, 2) AS average_rating,
       movies
FROM   (SELECT name_id,
               n.primary_name                       AS d_name,
               SUM(t.average_rating) / COUNT(*)     AS avg,
               COUNT(*)                             AS movies
        FROM   principals
               LEFT JOIN names AS n
                      ON n.id = name_id
               LEFT JOIN titles AS t
                      ON t.id = title_id
        WHERE  category_id = "director"
               AND t.title_type = "movie"
               AND t.num_votes > 1000
               AND n.death_year IS NULL
        GROUP  BY name_id
        ORDER  BY avg ASC) AS t
WHERE  movies > 10
ORDER  BY avg ASC
LIMIT  10; 
name average_rating movies
Uwe Boll 3.603 29
Fred Olen Ray 3.7 11
Albert Pyun 4.146 28
Bert I. Gordon 4.158 12
Timothy Woodward Jr. 4.191 11
Jim Wynorski 4.25 14
Nico Mastorakis 4.307 15
Charles Band 4.515 13
Steven C. Miller 4.636 11
Gregory Hatanaka 4.727 11
· meins, robsite, imdb, filme ·
Mastodon