Commit Graph

70 Commits

Author SHA1 Message Date
tooomm
4b11546a8a Revert "test regarding error.yml"
This reverts commit 0cb9bb1948.
2018-05-24 20:22:04 +02:00
tooomm
0cb9bb1948
test regarding error.yml 2018-05-24 19:09:24 +02:00
tooomm
21f66c65b6
remove unneeded time stamp in console log (#179)
* Remove debug/console timestamp
2017-11-02 18:55:10 +01:00
Dave
432ba1d028 Let xml writer handle back sides of DFC 2017-09-18 12:38:39 -05:00
dev-id
7243182be3 Error for manaCost == "" 2017-09-11 12:22:10 -05:00
Dave
075faa71ac Update spoilers.py 2017-09-08 12:49:45 -05:00
Dave
0a258379fb Update spoilers.py 2017-09-08 12:40:36 -05:00
Dave
dea584cb35 [WIP] Double-faced card handling (#144)
* Double-faced card handling

Remove duplicate debug print (no image shows up in error file)

* Handle DFC different than Split

* Don't DFC if no number

Cards with ? for a card number shouldn't be attempted to match for DFC
2017-09-07 21:08:06 -05:00
dev-id
fa10639000 Don't make set files for sets without cards.
Don't put them in spoiler.json
2017-09-07 11:21:06 -05:00
tooomm
0fd2ead70d more future sets info (#143)
Add future sets to setinfo with any available information completed and the rest commented.

noRSS enabled for each, spoiler.rss is current-set-only. As MTGS spoiler.rss gets updated, noRSS should be removed/falsed for the current set. 

* updated with basic info for future sets

source: http://www.mtgsalvation.com/forums/magic-fundamentals/the-rumor-mill/673776-schedule-of-upcoming-releases-and-spoiler-seasons

* added mythicCode to optional keys

* Update set_info.yml

http://magic.wizards.com/en/products/iconic-masters

http://markrosewater.tumblr.com/post/163779974563/wait-so-the-unstable-set-code-is-ust-i-thought

http://magic.wizards.com/en/products/rivals-ixalan

http://magic.wizards.com/en/products/masters-25

http://magic.wizards.com/en/products/dominaria

http://magic.wizards.com/en/products/core-2019

* Update set_info.yml

* Update set_info.yml

* Update mtgs_scraper.py

* Update mtgs_scraper.py

* Update set_info.yml

* more documentation

* Only print set stats for sets with cards

* Move set has cards check for debug print

* Print line if set has no cards
2017-09-07 10:32:42 -05:00
dev-id
da4a8ba28b New WOTC card gallery URL 2017-09-06 12:20:19 -05:00
dev-id
9693dad628 Refactor get_image_urls function to use only setinfo as input. 2017-09-02 21:21:29 -05:00
Cheldra
d042f71009 New set_info value mythicCode (#135)
* New set_info value mythicCode

Mythicspoiler is using IXA rather than XLN for some reason. Creating a new parameter to account for this.

* mythicCode

* mythicCode

* Mtgs name fixes

* Mtgs name fixes

* Set mythicCode to code if not present in setinfo
2017-09-02 10:25:22 -05:00
tooomm
ff26d45bd0 move utc tag in xml file 2017-08-23 11:33:11 +02:00
tooomm
0293efcea0 add timezone to runtime 2017-08-23 11:26:20 +02:00
tritoch
9d5f723fc8 Use scryfall data for C17 (disables all other sources) (#127)
* Use scryfall data for C17 (disables all other sources)

* Allow individual sets to be scryfall-only sourced.

Remove prototyped variables (and add one...)
2017-08-12 23:20:24 -05:00
Zach H
2b41643255 Merge pull request #109 from tritoch/replace-typeline-dash
Replace `-` in type line with `—`
2017-07-11 00:17:52 -04:00
tooomm
a3a3fc74e3 update date/time format (#114)
Remove invalid and redundant datetime tags
2017-07-10 21:21:07 -05:00
tritoch
f74bd35c8d Eliminate 'split_cards' array. Use 'names' key to determine split cards. (#111) 2017-07-07 16:27:50 -05:00
tritoch
846a41d2e2 Error on type not in valid type list (#108) 2017-07-07 16:27:22 -05:00
tritoch
4d62dcf946 Fix costs in text (#107)
* Replace anything found in `{}` with uppercase, single character versions

* Remove debug print
2017-07-07 16:27:00 -05:00
tritoch
291efdb19e Replace - in type line with — 2017-07-07 00:20:44 -05:00
tritoch
c13a719944 Remove open parenthesis deletion
Fixes #81
2017-07-06 23:18:17 -05:00
tritoch
9004f6d285 formatting 2017-07-06 22:35:53 -05:00
tritoch
04e6a1892f Verify spoiler.xml against Cockatrice's XSD file (#105)
Verifies spoiler.xml against Cockatrice's XSD file

verify_xml currently takes an xml file and XSD as a string

Prints a pass/fail above XML dump

* Re-order xml writing to pass XSD

* Improved XSD verification

Now prints error

Now handles malformed XML or XSD
2017-07-06 21:37:11 -05:00
tritoch
3ee6aa3842 Merge pull request #106 from tritoch/build-date
Build date in XML
2017-07-06 21:36:16 -05:00
tritoch
dc9b9b7a48 Refactor set_info, download_images to scraper sub
Refactor set_info to align with mtgjson keys.

Move download_images to wizards_scraper
2017-07-06 19:46:26 -05:00
tritoch
c40355f0fb Build date in XML 2017-07-06 17:49:36 -05:00
tritoch
5b987d28cf Change input files to YAML (#99)
Input files to yaml

Deduplicate file verification, move it out to module.

Remove commentjson requirement
2017-07-06 14:25:10 -05:00
Lee Matos
8900c1f8af Remove urllib requirement and replace with requests 2017-07-05 22:43:19 -04:00
Lee Matos
1dd538d5a1 First pass refactoring scrapers into separate modules (#98)
Splits off the respective scrapers into submodules (mtgs_scraper.py, scryfall_scraper.py, mythic_scraper.py, wizards_scraper.py)
2017-07-05 20:44:45 -05:00
tritoch
4e43b90156 Write AllSets.json
Don't scrape scryfall if we disable comparison

Toggle for Dumping Error log
2017-06-30 09:42:36 -05:00
tritoch
d8d31f4aab Scrape mana symbols from WOTC card gallery (#68) 2017-06-30 09:05:43 -05:00
tritoch
876d3a800f Merge pull request #71 from tritoch/split-aftermath-image-match
Better split/aftermath image matching
2017-06-28 20:04:32 -05:00
tritoch
82186fdf2e Better split/aftermath image matching 2017-06-28 19:43:59 -05:00
tritoch
9f896e1a0a Forget me not 2017-06-28 16:09:42 -05:00
tritoch
e40a7063b8 Handle X in card costs.
Slightly improved split card handling.

Error for Blank mana cost on nonland. If a nonland card with no mana cost (`0` is a mana cost) is printed, adding it to card_corrections will prevent it from appearing in the error log.
2017-06-28 12:27:09 -05:00
tritoch
fe96df161d Full Spoil preparations
Handle both WOTC gallery formats
2017-06-27 12:07:02 -05:00
tritoch
869ba84a19 Remove card exemption (#57)
* Remove card exemption

Mistakenly added thinking it was not a valid card.
2017-06-26 08:44:38 -05:00
tritoch
780a7c7715 WotC Image Gallery Regex Fix (#56)
WOTC changed the format for their card gallery, this will find the images with the new format.
2017-06-23 16:14:47 -05:00
tritoch
ec7270c524 Log Cleanup
* Scryfall debug log off by default

* Scryfall logging disabled by default.

Manual corrections print on one line.
2017-06-23 09:20:40 -05:00
tritoch
be0d113435 Ignore blank images from MTGS (#48) 2017-06-22 12:55:44 -05:00
tritoch
248e09219e More aftermath improvements. (#47)
"layout" now is "aftermath".

"colors" and "colorIdentity" will check both sides of card.
2017-06-22 12:54:59 -05:00
tritoch
55fe07b63b Prevent Print Error from Halt
If a unicode error happens in a debug print, program halted.
2017-06-22 12:19:20 -05:00
tritoch
6d21eb5ca8 Scrape MTGS for images. Fix WOTC URL (#36)
* Scrape MTGS for images

* Fix WOTC card gallery URL.

* Additional Aftermath Fixes
2017-06-21 12:44:39 -05:00
tritoch
be0a51267c Aftermath Handling
Split MTGS-sourced aftermath card text on two newlines instead of three, which will catch both
2017-06-21 10:08:03 -05:00
tritoch
0f2d772821 Return of Corrections (#29) 2017-06-20 12:33:32 -05:00
tritoch
c68da7ccab Types and errorlog improvements (#26)
* Ignore corrected files in Error.
Card Type/Types detection improvement.

* Merge fix

Bad Merge Conflict Resolution
2017-06-20 09:50:16 -05:00
tritoch
47234b17d2 Corrections Post Scrape, Unicode JSON
Corrections should be after all scraping. Output files should be Unicode.
2017-06-19 20:38:12 -05:00
tritoch
4ee8c132eb Improved Split Card Handling. Dump XML to log. 2017-06-19 14:45:32 -05:00