Topic: Python Grabber scripts (Read 82272 times)

Python Grabber scripts

Reply #25
Quote
Aren't you from Canada?

It looks easy to find the lyrics response, but the lyrics are presented in Flash, so it's impossible with the Python grabber. Sorry.


I'm from Canada, but I can read and understand Japanese. I also lived in Japan for a year and accumulated a pretty big collection of Japanese music!

It's too bad about the above site. Actually, I found an alternative site that uses the same back end for lyrics but doesn't seem to use Flash.
Here it is: http://music.goo.ne.jp/lyric/index.html

Halfway down the page you'll see a search box labelled: ?? - ????
Same concept as the other site: if you select ??, you can search by song (??????? is for searching by artist name).
The button you need to click to search is labelled: ????

Hope you can do something with this one!
Thanks again


Reply #26
I'll try this one; it should be OK (although I don't know whether the characters will mess something up, but we'll see).

Do you know how many lyrics they have?


Reply #27
Quote
I'll try this one; it should be OK (although I don't know whether the characters will mess something up, but we'll see).

Do you know how many lyrics they have?



Thanks!
Well, according to uta-net, they have 84,000 songs in the database.
I've got tons of obscure Japanese songs, for example stuff from the 1960s, and I was able to find every single one in the database.

Thanks again!


Reply #28
Benji99, here is the script:

[attachment=5458:goo.rar]
I've tested it by tagging some files whose title and artist are present on the site, and it worked. If you find any problems, post them here.

What I've learned:
- Japanese glyphs have many encodings
- some sites don't like Python
- even more about Unicode
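
The first and last points can be illustrated in a few lines. This is only a minimal sketch, assuming Python 3 (the scripts in this thread are Python 2) and a sample word picked for illustration; it is not taken from goo.py.

```python
# Encode one Japanese word with several codecs from the standard library.
sample = "\u6b4c\u8a5e"  # the word for "lyrics"; an arbitrary sample

for codec in ("utf_8", "euc_jp", "shift_jis", "iso2022_jp"):
    raw = sample.encode(codec)
    # Same two characters, but different bytes (and byte counts) per codec.
    print(codec, len(raw), raw.hex())
    # Decoding with the matching codec restores the original text.
    assert raw.decode(codec) == sample

# Decoding with the wrong codec gives garbage instead of the original word.
wrong = sample.encode("euc_jp").decode("shift_jis", errors="replace")
print(repr(wrong))
```

This is why a grabber has to know which encoding a site expects in its search form and which one it uses in its response.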


Reply #29
Once again AMG scripts

Now GENRE, STYLE, MOOD and THEME can be assigned at once with:
[attachment=5477:AMG_Release.rar]
and a new AMG review script with a custom user-agent report, looser artist matching and an option to print some info in the console:
[attachment=5478:AMG_Review.rar]

Here is an example for AMG_Release:

1. Select a custom tag in the Python grabber settings.

2. Run the script and update the files.

3. Select Properties > Tools > Automatically fill values
    source: Other and your custom tag
    pattern: Genres: %genre% \\ Styles: %style% \\ Moods: %mood% \\ Themes: %theme%



4. Then remove the AMG tag, Ctrl-click to select the newly added tags (GENRE, STYLE, MOOD and THEME), choose "Split values" and click OK.

If we already have GENRE and STYLE tags and don't want to update them, then we enter this pattern instead: %tmp% \\ Moods: %mood% \\ Themes: %theme% so that GENRE and STYLE remain untouched.
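
What steps 3 and 4 achieve can be sketched in plain Python. This is only an illustration, not how foobar2000 implements "Automatically fill values" or "Split values"; the sample tag text and the function name `split_amg_tag` are assumptions, with the "; " separator taken from the values reported later in this thread.

```python
# Hypothetical content of the custom AMG tag; the exact text the script
# writes is an assumption based on the pattern shown above.
AMG_TAG = r"Genres: Jazz \\ Styles: Swing; Big Band \\ Moods: Fiery; Literate \\ Themes: Late Night"

def split_amg_tag(value):
    """Turn 'Genres: a; b \\ Styles: c' text into {TAG_NAME: [values]}."""
    fields = {}
    for part in value.split(r" \\ "):
        label, _, raw = part.partition(":")
        # "Genres" -> "GENRE": drop the plural "s" and upper-case the name.
        name = label.strip().rstrip("s").upper()
        # Equivalent of "Split values": one entry per "; "-separated item.
        fields[name] = [item.strip() for item in raw.split(";") if item.strip()]
    return fields

tags = split_amg_tag(AMG_TAG)
print(tags["STYLE"])
```

Leaving GENRE and STYLE untouched, as with the %tmp% pattern, would simply mean ignoring those two keys of the returned dictionary.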

As a reminder, all AMG scripts rely mostly on a correct release (%album%) name.
And do comment about problems; I'm rewriting these scripts as I run into inconsistencies.
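
For readers wondering what a custom user-agent involves: some sites reject the default identifier sent by Python's URL library. The following is a minimal sketch, assuming Python 3's `urllib.request`; the user-agent string, the example URL and the helper name `build_request` are made up for illustration and are not what AMG_Review uses.

```python
import urllib.request

def build_request(url, user_agent="Mozilla/5.0 (compatible; example-grabber/0.1)"):
    """Return a Request object that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})

request = build_request("http://www.example.com/")
# urllib stores header names capitalized, so the key reads "User-agent".
print(request.get_header("User-agent"))
# To actually fetch the page: urllib.request.urlopen(request).read()
```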


Reply #30
Download this AMG release script:
[attachment=5480:AMG_Release.rar]
The problem with the previous one is described here: http://www.hydrogenaudio.org/forums/index....st&p=666954

Now use this pattern:
Genres:%genre% \\ Styles:%style% \\ Moods:%mood% \\ Themes:%theme%



Reply #32
Enjoy
I didn't forget about the composer/performer conversation; I'll post that soon.

Here is a masstagger script for cleaning the %amg% tag (Canar's version). Just run it after the script (if %genre% and %style% should be preserved, delete the first two actions from the masstagger script):
[attachment=5484:AMG_release_MTS.rar]


Reply #33
Quote
Enjoy
I didn't forget about the composer/performer conversation; I'll post that soon.

Thank you very much... I will wait on tagging any Various Artists albums until that one comes.

Quote
Here is a masstagger script for cleaning the %amg% tag (Canar's version). Just run it after the script (if %genre% and %style% should be preserved, delete the first two actions from the masstagger script):
[attachment=5484:AMG_release_MTS.rar]


I was just in the process of trying to figure out how to use Masstagger to do this... the timing of this script is perfect!

By the way, I have been using the Python scripts on a few albums this morning and they are working great!


Reply #34
Quote
Benji99, here is the script:

[attachment=5458:goo.rar]
I've tested it by tagging some files whose title and artist are present on the site, and it worked. If you find any problems, post them here.

What I've learned:
- Japanese glyphs have many encodings
- some sites don't like Python
- even more about Unicode



Huge thanks for this script!!
It works really well, except for a couple of small bugs, if you have some free time...

1st bug:
Certain track titles make the script crash.
Code: [Select]
foo_grabber_python: Traceback (most recent call last):
  File "I:\Program Files\foobar2000\pygrabber\scripts\goo.py", line 63, in Query
    raw_title = handle.Format('[%title%]').decode("utf8").encode("euc_jp")
UnicodeEncodeError: 'euc_jp' codec can't encode character u'\uff5e' in position 13: illegal multibyte sequence


This seemingly happens when a track has the '~' character in the title.

A couple examples:
Track title: HIGH G.K LOW ~????~
Artist: GreeeeN

Track title: ?? ~????~
Artist: GreeeeN


Although, this one works:
Track title: ??~?????????????~
Artist: THE BOOM


2nd bug: The script seems to have trouble finding tracks when a large number of tracks share the same name.

For example:
Track title: YOU
Artist: ??????????

Track title: ?
Artist: ??????????

I think I know how this 2nd bug can be fixed: I found out that the site has a more advanced search function:
http://music.goo.ne.jp/lyric/db.php
There you can enter both the artist (???????) and the track title (??).
If you can modify the script to use that page instead, it would make it really accurate!

Huge thanks again!
Sebastien








Reply #35
Quote
1st bug:
Certain track titles make the script crash.
Code: [Select]
foo_grabber_python: Traceback (most recent call last):
  File "I:\Program Files\foobar2000\pygrabber\scripts\goo.py", line 63, in Query
    raw_title = handle.Format('[%title%]').decode("utf8").encode("euc_jp")
UnicodeEncodeError: 'euc_jp' codec can't encode character u'\uff5e' in position 13: illegal multibyte sequence


This seemingly happens when a track has the '~' character in the title

Is that happening only with that character? If so, it can be easily fixed.
That character is the fullwidth tilde "～" (U+FF5E), not the ordinary tilde "~".
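
Since these characters look nearly identical on screen, asking Python for their Unicode names is a quick way to tell them apart. A minimal sketch, assuming Python 3 and the standard `unicodedata` module:

```python
import unicodedata

# The ordinary tilde, the character from the traceback, and the wave dash.
candidates = ("~", "\uff5e", "\u301c")

names = {}
for ch in candidates:
    names[ch] = unicodedata.name(ch)
    print("U+%04X %s" % (ord(ch), names[ch]))
```

Running the same loop over a track title shows exactly which of the three a tag contains.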

Quote
2nd bug: The script seems to have trouble finding tracks when a large number of tracks share the same name.

Yeah, I would expect that, because the script only tries to find a match on the first result page, and there can be more pages for some common titles.
I'll check your suggestion and try to make the script better.
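
The general idea of looking beyond the first result page can be sketched as follows. This is not the code from goo.py: `fetch_page` is a hypothetical callable standing in for whatever downloads and parses one result page, and the (artist, url) tuple layout is an assumption made for the example.

```python
def find_first_match(fetch_page, artist, max_pages=5):
    """Scan result pages in order and return the URL of the first entry by `artist`.

    `fetch_page(n)` is a hypothetical callable returning a list of
    (artist, url) tuples for page n, or an empty list past the last page.
    """
    wanted = artist.strip().lower()
    for page in range(1, max_pages + 1):
        entries = fetch_page(page)
        if not entries:
            break  # ran out of result pages
        for entry_artist, url in entries:
            if entry_artist.strip().lower() == wanted:
                return url
    return None

# Fake result pages, used only to exercise the function.
FAKE_PAGES = {
    1: [("Artist A", "/lyric/1"), ("Artist B", "/lyric/2")],
    2: [("Artist C", "/lyric/3")],
}
match = find_first_match(lambda n: FAKE_PAGES.get(n, []), "artist c")
print(match)
```

Capping the scan with `max_pages` keeps a very common title from triggering dozens of requests per track.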


Reply #36
@2E7AH:
I think replacing u'\uff5e' is a workaround:
Code: [Select]
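# Map FULLWIDTH TILDE (U+FF5E) to WAVE DASH (U+301C), which euc_jp can encode.
# Using the unicode method avoids needing "import string".
s = handle.Format('[%title%]').decode("utf8")
raw_title = s.replace(u'\uff5e', u'\u301c').encode("euc_jp")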
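
For anyone trying this outside foobar2000, the same workaround can be reproduced stand-alone. A minimal sketch, assuming Python 3; the helper name `title_to_euc_jp` is made up here, and the sample title is the Latin part of one of the failing titles reported above followed by a fullwidth tilde.

```python
FULLWIDTH_TILDE = "\uff5e"
WAVE_DASH = "\u301c"

def title_to_euc_jp(title):
    """Encode a title as EUC-JP, mapping the fullwidth tilde to the wave dash first."""
    return title.replace(FULLWIDTH_TILDE, WAVE_DASH).encode("euc_jp")

sample_title = "HIGH G.K LOW \uff5e"

# Encoding the title directly reproduces the error from the traceback.
try:
    sample_title.encode("euc_jp")
    direct_encoding_failed = False
except UnicodeEncodeError:
    direct_encoding_failed = True

encoded = title_to_euc_jp(sample_title)
print(direct_encoding_failed, encoded)
```

Whether the site then matches the wave dash against its own titles is a separate question that only testing against the site can answer.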


Reply #37
Quote
Is that happening only with that character? If so, it can be easily fixed.
That character is the fullwidth tilde "～" (U+FF5E), not the ordinary tilde "~".


Oops, I forgot to respond to this. Whenever it crashes, that character is always in the track title.
Thanks

Btw, regarding a more complete AMG script: I already wrote The Godfather scripts for this years ago, and there are a few inconsistencies with the site. For example, the way it displays the performer and composer sometimes changes. In particular, it handles Various Artists albums, and albums where a few tracks are collaborations with a second performer, differently. If you can read Delphi and are interested in the logic I used to code around it, drop me a PM with your email and I'll send the scripts to you.

I've been wanting to update it in Python, but I found Python really hard to read and understand...


Reply #38
2E7AH, i'm trying to use your AMG script, but have trouble with the "split values" step. the values don't seem to be splitting. when i set up a Filter to show %mood%, for example, the entries are not separate, and i get things like "Uncompromising; Fiery; Literate; Cerebral; Brooding" all on one line.

what am i doing wrong?


Reply #39
Quote
2E7AH, i'm trying to use your AMG script, but have trouble with the "split values" step. the values don't seem to be splitting. when i set up a Filter to show %mood%, for example, the entries are not separate, and i get things like "Uncompromising; Fiery; Literate; Cerebral; Brooding" all on one line.

what am i doing wrong?


I think you need to make sure that "MOOD" is listed as a multivalue field in Preferences/Advanced/Display/Properties Dialog.


Reply #40
I have a quick question regarding the python discogs genre/style grabber scripts.

How do you know you've gone over the 5000 limit? Does the lookup just fail?

edit :

And the AMG script gives me reviews in the AMG tags

Example :

Quote
New horizons in historic jazz reissuing were revealed in 2005 when Jazz Oracle came out with a double-CD compendium of recordings made for about a dozen different labels between October 1924 and February 1933 in Vienna, Paris, and Berlin, all involving bandleader Lud Gluskin (1898-1989). Andreas Schmauder, apparently one of the world's leading Gluskin authorities, was asked to paw through literally hundreds of 78 rpm platters to designate the 48 titles included in this package, which is loaded with precious photographs and fascinating information. Gluskin first appears as a drummer with Paul Gason and His Versatile Orchestra. "Ain't She Sweet?" is performed by the Playboys, a Detroit-based band that would soon morph into an expanded and more versatile orchestra under Gluskin's direction. Subsequent billings list the perpetually evolving group as Lud Gluskin and His Versatile Juniors, Lud Gluskin et Son Jazz Orchestre Lud Gluskin, "Lud" Gluskin Ambassadonians, Lud Gluskin and his Ambassadors Orchestra, Jazz-Orchester Lud Gluskin, and finally Lud Gluskin et son Orchestre, which is the name they appeared under most often when serenading patrons at the Casino de Paris. The sound of the band often brings to mind great old-time jazz heroes like the Original Memphis Five, Red Nichols, Miff Mole, Bix Beiderbecke, Frankie Trumbauer, Frank Teschmacher and Jean Goldkette, whose arrangements were in fact used by Gluskin from time to time. Material ranges from hot novelty dance music and traditional pop tunes to substantial jazz numbers like "Tiger Rag," "Milenberg Joys," "Clarinet Marmalade," W.C. Handy's "St. Louis Blues," Fats Waller's "Whiteman Stomp," and Fud Livingston's "Feelin' No Pain." Jazz Oracle continues to astonish and delight all who are fascinated with obscure jazz records from the early 20th century. This installment is particularly rewarding.

AMG Review by arwulf arwulf


This was in one of my files' AMG tag!


edit again :


You had another script for looking up tags from Last.fm. Using it would cause inconsistency with the Discogs way of tagging, which allows only a stricter set of styles and genres. Would it be possible somehow to tag using the Last.fm method only if the style/genre appears in the Discogs list of styles and genres, for example if such a list were saved as a .txt file?

 


Reply #41
Quote
Would it be possible somehow to tag using the Last.fm method only if the style/genre appears in the Discogs list of styles and genres, for example if such a list were saved as a .txt file?

A script like that exists for the Picard tagger, and it seems reasonable because last.fm tags are a mess. I don't use the last.fm script and I'm not interested in making it happen, but the script is there, so maybe you can extend it a bit.
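
The filtering step itself is small. The following is a minimal sketch, assuming Python 3 and a plain text file with one allowed genre or style per line; the file name in the comment, the sample names and both function names are assumptions for illustration, not part of any existing script.

```python
import io

def load_whitelist(lines):
    """Map each lower-cased name to its canonical spelling from the list."""
    return {line.strip().lower(): line.strip() for line in lines if line.strip()}

def filter_tags(tags, whitelist):
    """Keep only tags found in the whitelist, using the whitelist's spelling."""
    kept = []
    for tag in tags:
        key = tag.strip().lower()
        if key in whitelist:
            kept.append(whitelist[key])
    return kept

# Stand-in for open("discogs_styles.txt"); a real file would be read the same way.
sample_file = io.StringIO("Electronic\nRock\nAmbient\nShoegaze\n")
allowed = load_whitelist(sample_file)

filtered = filter_tags(["rock", "seen live", "ambient", "favourites"], allowed)
print("; ".join(filtered))
```

Matching case-insensitively matters because last.fm tags are user-entered and rarely match the curated capitalization.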


Reply #42
By the way, I did find this little piece of information posted on the discogs forums about a month ago :

Quote
It's in our plans to remove the 5000 per day limit. That should happen within the next 2 months.
Source!

That should greatly enhance the usefulness of your discogs scripts


Reply #43
I've now tried playing around with your last.fm script to see if I can tailor it to fetch other sorts of last.fm info. However, my lack of knowledge with Python is a hindrance! I've changed the script slightly so that it fetches similar artists.

Here's an example of the information the script can get : http://ws.audioscrobbler.com/2.0/?method=a...ac220b7b2e0a026

I've set a limit to 3 artists. However, I have a hard time getting all three artists - I can only fetch one, just like you only wanted the top tag. So my question is, what do I have to do with this piece of code so that I can fetch all three similar artists?

Code: [Select]
                child = doc.getElementsByTagName("artist")[0]
                toptag = child.getElementsByTagName("name")[0]
                lyric = toptag.childNodes[0].data.encode('utf_8').capitalize()

I see that changing the index in the brackets of the ("artist")[0] snippet selects the next artist. However, I can't figure out how to get them all in one go!


Reply #44
look at the other last.fm script, i.e.

Code: [Select]
toptags = child.getElementsByTagName("tag")
tags=[]
for i in toptags:
    tags.append(str(i.getElementsByTagName("name")[0].toxml()).replace('<name>','').replace('</name>','').capitalize( ))
lyric=str(tags).strip('[]').replace(',', ';').replace('\'','')


try something like that


Reply #45
Quote
look at the other last.fm script, i.e.

Code: [Select]
toptags = child.getElementsByTagName("tag")
tags=[]
for i in toptags:
    tags.append(str(i.getElementsByTagName("name")[0].toxml()).replace('<name>','').replace('</name>','').capitalize( ))
lyric=str(tags).strip('[]').replace(',', ';').replace('\'','')

try something like that

Thanks for the suggestion. I now get a syntax error in some of the code you posted. To make it a little easier, here's a larger snippet with your suggestion pasted in:

Code: [Select]
            artist = handle.Format("[%artist%]")
            title = handle.Format("3")

            try:
                string=urllib.urlopen('http://ws.audioscrobbler.com/2.0/?method=artist.getsimilar&artist=' + artist.lower().replace(' ','+') + '&limit=' + title.lower().replace(' ','+') + '&api_key=' + api_key).read()
                doc = minidom.parseString(string)
                toptags = child.getElementsByTagName("tag")
                tags=[]
            for i in toptags:
                tags.append(str(i.getElementsByTagName("name")[0].toxml()).replace('<name>','').replace('</name>','').capitalize( ))
                lyric=str(tags).strip('[]').replace(',', ';').replace('\'','')
                result.append(lyric)
            except Exception, e:
                traceback.print_exc(file=sys.stdout)
                result.append('')
            continue

        return result

if __name__ == "__main__":
    LyricProviderInstance = LastFm_TopTag()


Reply #46
I just pasted that part from the other last.fm script; it wasn't intended for literal use, only as an example.
Look at your XML response: there isn't a "tag" node anywhere, so use what you need and play with it a little.

[edit] To help you a bit:

Where is the "child" variable that you use for "toptags" to look for the "tag" node (which, as said, doesn't exist in the XML response)?
Why isn't "doc" used?
Give those variables names that are meaningful to you.

You can set title explicitly; there's no need for "handle.Format" if you don't need any tag info to be provided.
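
Put together, those hints amount to looping over the "artist" nodes of the parsed document. A minimal sketch, assuming Python 3; the XML below is a trimmed, hand-written stand-in whose shape (artist elements containing name elements) is inferred from the code in this thread, not a captured API response.

```python
from xml.dom import minidom

# Hand-written sample in the assumed shape of an artist.getsimilar response.
SAMPLE = """<lfm status="ok">
  <similarartists artist="Example">
    <artist><name>Hidria Spacefolk</name></artist>
    <artist><name>Gong</name></artist>
    <artist><name>Kingston Wall</name></artist>
  </similarartists>
</lfm>"""

def similar_artist_names(xml_text, limit=3):
    """Return the text of the first <name> inside each <artist>, up to `limit`."""
    doc = minidom.parseString(xml_text)
    names = []
    for artist_node in doc.getElementsByTagName("artist")[:limit]:
        name_node = artist_node.getElementsByTagName("name")[0]
        names.append(name_node.firstChild.data.strip())
    return names

joined = "; ".join(similar_artist_names(SAMPLE))
print(joined)
```

Because the loop runs over whatever nodes exist, it also copes with responses that contain fewer artists than the limit.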


Reply #47
Quote
I just pasted that part from that other last.fm script, it wasn't intended for literal use, but as example
Look at your XML response: there isn't "tag" node anywhere, so use what you need, and play a little

[edit] to help you a bit:

where is your "child" variable that you are calling with "toptags" and pasting to look for "tag" node (which doesn't exist BTW in XML response as said)?
why "doc" isn't called?
name those variables meaningful to you

you can explicitly set title, no need for "handle.Format" if you don't need info for some tag to be provided

Actually, after your last post, I got somewhat confused and thought I'd try and break it down to something I could understand. I've never looked at Python coding before today, so be aware that much of the code - especially what you pasted - is completely foreign to me. This is something rough I managed to do on my own :

Code: [Select]
string=urllib.urlopen('http://ws.audioscrobbler.com/2.0/?method=artist.getsimilar&artist=' + artist.lower().replace(' ','+') + '&limit=' + title.lower().replace(' ','+') + '&api_key=' + api_key).read()
doc = minidom.parseString(string)
child = doc.getElementsByTagName("artist")[0]
toptag1 = child.getElementsByTagName("name")[0]
child = doc.getElementsByTagName("artist")[1]
toptag2 = child.getElementsByTagName("name")[0]
child = doc.getElementsByTagName("artist")[2]
toptag3 = child.getElementsByTagName("name")[0]
lyric1 = toptag1.childNodes[0].data.encode('utf_8').capitalize()
lyric2 = toptag2.childNodes[0].data.encode('utf_8').capitalize()
lyric3 = toptag3.childNodes[0].data.encode('utf_8').capitalize()
result.append(lyric1)

This gives me 3 variables - lyric1, lyric2, lyric3. They are the names of the artists I'm trying to fetch. I can change the variable appended in the last line to call the different bands/artists. However, if I try to append several, for example by doing this :

Code: [Select]
result.append(lyric1)
result.append(lyric2)
result.append(lyric3)

.. It finds the tags, but applying them causes foobar to crash completely!

Sorry for abandoning your example. Your added hints seem helpful, so I will look it over again. However, if you want to help me out with what I have to do to get lyric1, lyric2 and lyric3 appended to the result variable, I'll be happy.
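
One possible explanation for the crash, which is an assumption and not confirmed anywhere in this thread, is that the grabber expects exactly one result string per track, so three appends per track leave the list longer than the selection. A sketch of keeping one entry per track, assuming Python 3; `collect_results` and `fake_lookup` are hypothetical names, with `fake_lookup` standing in for the Last.fm request.

```python
def collect_results(tracks, lookup):
    """Return one string per track, joining multiple values with '; '."""
    result = []
    for track in tracks:
        try:
            names = lookup(track)
        except Exception:
            names = []  # a failed lookup still gets an (empty) entry
        # A single append per track keeps len(result) == len(tracks).
        result.append("; ".join(names))
    return result

def fake_lookup(artist):
    # Hypothetical stand-in for the web request; always returns three names.
    return ["Hidria Spacefolk", "Gong", "Kingston Wall"]

values = collect_results(["Some Artist", "Another Artist"], fake_lookup)
print(values)
```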


Reply #48
hey, why ride motorcycle when I have my bike 

leave it for tomorrow

BTW, %lastfm_similar_artist% is already provided by the biography view component for the now-playing artist.


Reply #49
Quote
hey, why ride motorcycle when I have my bike 

leave it for tomorrow

BTW %lastfm_similar_artist% is already provided by biography view component for nowplaying artist

Don't worry about it! I made something that works, although I'm sure the code will make any regular Python programmer wince.

To summarize, this (horrible, but working) modification of 2E7AH's Last.FM genre-tagging script will fetch the 3 most similar artists from last.fm and write them to a tag, for example "Hidria Spacefolk; Gong; Kingston Wall".

Code: [Select]
# sys and traceback are needed by the except block in Query()
import sys
import traceback
import urllib
from xml.dom import minidom
from encodings import utf_8
from grabber import LyricProviderBase

class LastFm_TopTag(LyricProviderBase):
    def GetName(self):
        return "LastFm Similarity"

    def GetVersion(self):
        return "0.1"

    def GetURL(self):
        return "http://ws.audioscrobbler.com/"

    def Query(self, handles, status, abort):
        result = []
        api_key = 'b25b959554ed76058ac220b7b2e0a026'

        for handle in handles:
            status.Advance()

            if abort.Aborting():
                return result

            artist = handle.Format("[%artist%]")
            title = handle.Format("3")

            try:
                string=urllib.urlopen('http://ws.audioscrobbler.com/2.0/?method=artist.getsimilar&artist=' + artist.lower().replace(' ','+') + '&limit=' + title.lower().replace(' ','+') + '&api_key=' + api_key).read()
                doc = minidom.parseString(string)
                child = doc.getElementsByTagName("artist")[0]
                toptag1 = child.getElementsByTagName("name")[0]
                child = doc.getElementsByTagName("artist")[1]
                toptag2 = child.getElementsByTagName("name")[0]
                child = doc.getElementsByTagName("artist")[2]
                toptag3 = child.getElementsByTagName("name")[0]
                lyric = toptag1.childNodes[0].data.encode('utf_8').capitalize() + ('; ') +  toptag2.childNodes[0].data.encode('utf_8').capitalize() + ('; ') + toptag3.childNodes[0].data.encode('utf_8').capitalize()
                result.append(lyric)
            except Exception, e:
                traceback.print_exc(file=sys.stdout)
                result.append('')
            continue

        return result

if __name__ == "__main__":
    LyricProviderInstance = LastFm_TopTag()
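
The script builds its request URL by replacing spaces with "+", which leaves characters such as "&" or accented letters unescaped. A hedged alternative, assuming Python 3's `urllib.parse` (in Python 2 the equivalent function is `urllib.urlencode`); the artist name and the `YOUR_API_KEY` placeholder are made up for the example, while the parameter names come from the URL used above.

```python
from urllib.parse import urlencode

def similar_artists_url(artist, api_key, limit=3):
    """Build the artist.getsimilar request URL with properly escaped parameters."""
    params = [
        ("method", "artist.getsimilar"),
        ("artist", artist),
        ("limit", limit),
        ("api_key", api_key),
    ]
    return "http://ws.audioscrobbler.com/2.0/?" + urlencode(params)

# Illustrative artist name containing a space, an accent and an ampersand.
url = similar_artists_url("Sigur R\u00f3s & Friends", "YOUR_API_KEY")
print(url)
```

Passing the limit as its own parameter also removes the need to carry the number 3 through the `title` variable.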