User:Badmachine/wikimedia-research-2014-08-26

[14:33:44] <halfak>	 Hey Ironholds 
[14:33:49] <halfak>	 uber analytics meeting!
[15:26:26] <Ironholds>	 R protip of the day: magrittr makes your code 20 times more readable.
[15:32:22] <DarTar>	 hey halfak, let me grab some coffee, brb
[15:32:26] <milimetric>	 Ironholds / DarTar: tnegrin suggested that we sync up a bit about the mobile dashboard stuff
[15:32:28] <halfak>	 Hokay
[15:32:34] <Ironholds>	 milimetric, okie-dokes!
[15:32:36] * halfak runs to get a candy bar
[15:32:48] <Ironholds>	 just lemme know when so I can try and get on a decent connection
[15:32:57] * Ironholds may go over to the backup hacker collective for that
[15:33:13] <milimetric>	 Ironholds: after lunch east coast?  So like 13:00 EST?
[15:33:30] <milimetric>	 lemme look at your calendars
[15:33:47] <Ironholds>	 oh yeah we're in the same timezone now!
[15:33:49] <Ironholds>	 hi from the north!
[15:34:30] <milimetric>	 :)  time looked clear so I sent the invite
[15:34:38] <milimetric>	 feel free to reject - I'm gonna go get some food
[15:35:08] <Ironholds>	 cool!
[15:35:10] <Ironholds>	 ditto
[16:03:50] <DarTar>	 hey leila 
[16:04:01] <DarTar>	 for some reason I can’t respond via DM
[16:04:09] <DarTar>	 IRC tells me you’re not online
[16:04:23] <leila>	 mmm
[16:04:27] <leila>	 can you see here?
[16:04:27] <DarTar>	 anyway, if you read me, feel free to respond to that thread
[16:04:29] <DarTar>	 yes
[16:04:39] <leila>	 okay. I'll respond. thanks!
[16:05:01] <leila>	 (and I'll try to join for StandUp, if the connection cooperates. It's lunch time here, so I may be free.)
[16:09:31] <DarTar>	 leila: sounds good
[16:28:19] <halfak>	 DarTar: I've got 5 minutes if you do. 
[16:32:21] <leila>	 DarTar, halfak, are you in Hangout?
[16:32:30] <halfak>	 Will be in a minute. 
[16:32:33] <leila>	 got it
[16:32:37] <DarTar>	 coming
[16:33:59] <DarTar>	 Ironholds: yt?
[16:34:03] <DarTar>	 standup
[17:37:48] <Ironholds>	 darnit
[17:37:58] <Ironholds>	 I have an R problem in a class Leila is fricking great at and she's away :(
[17:58:14] <J-Mo>	 ping yuvipanda
[17:58:42] * yuvipanda pings J-Mo
[17:59:09] <J-Mo>	 :) how close are you to implementing CSV download in Quarry? I was hoping to use that functionality in my webinar tomorrow, if it was available.
[17:59:18] <yuvipanda>	 'sup
[17:59:50] <J-Mo>	 if not, I can use Wikimetrics instead. 
[18:00:34] <yuvipanda>	 J-Mo: oh, can do by tomorrow, sure
[18:00:36] <J-Mo>	 basically, I want to guide users through the process of grabbing a CSV dataset from the slaveDB, and manipulating those data with Python, + adding in some related data from the API
[18:00:38] <yuvipanda>	 J-Mo: let me hook it up
[18:00:50] <yuvipanda>	 should be done in a couple of hours
[18:01:01] <milimetric>	 yuvipanda: let me know if you get stuck on anything with CSV downloads
[18:01:04] <J-Mo>	 that okay? I don't want to throw an emergency deadline at you!
[18:01:06] <milimetric>	 I can help
[18:01:24] * J-Mo thanks milimetric and yuvipanda
[18:01:25] <yuvipanda>	 J-Mo: nah, it's trivial.
[18:01:35] <yuvipanda>	 milimetric: \o/ will poke if needed
[18:02:06] <yuvipanda>	 J-Mo: there's a recurring bug I can't track down yet tho - if a query is 'queued' for more than 10s, please ask people to hit 'submit' again
[18:02:10] <J-Mo>	 cool cool! ping me if you need my help (for some reason), or if you have questions, etc
[18:02:21] <J-Mo>	 will do, Yuvi
[18:03:03] <yuvipanda>	 J-Mo: cool :) also note that with python's default csv module, it barfs on Unicode CSVs, and you need to install the unicodecsv module
[18:03:15] <yuvipanda>	 that'll trip people up if they're not using englishwiki for stats
[18:05:01] <J-Mo>	 good catch. I'll work with that module (I'm making test scripts for people to manipulate, rather than having them write Python from scratch).
[18:05:23] <yuvipanda>	 cool
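
A minimal sketch of the Unicode CSV issue yuvipanda describes, assuming Python 3, where the standard csv module works on already-decoded text and so does not need the unicodecsv package (on Python 2, unicodecsv is the workaround mentioned above). The sample data and the function name read_unicode_csv are made up for illustration.

```python
import csv
import io


def read_unicode_csv(raw_bytes, encoding="utf-8"):
    """Decode raw CSV bytes and return the rows as a list of dicts."""
    # Decode first, then parse: in Python 3 the csv module operates on str.
    text = raw_bytes.decode(encoding)
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return list(reader)


# Hypothetical sample: page titles such as a non-English wiki might return.
sample = "page_title,views\nZürich,120\n東京,95\n".encode("utf-8")
rows = read_unicode_csv(sample)
for row in rows:
    print(row["page_title"], row["views"])
```

When reading from a file instead of bytes, open(path, newline="", encoding="utf-8") plays the same role as the explicit decode step here.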
[18:44:17] <DarTar>	 Ironholds, yt?
[18:44:43] <Ironholds>	 DarTar, yep!
[18:45:01] <DarTar>	 so I had a thought about the referral stuff while I was in the shower
[18:45:07] <DarTar>	 (be very afraid)
[18:46:20] <DarTar>	 in the context of referred traffic we’ve been talking about PVs and deprioritized UC/UV because the sampled data is not the best source to answer that question
[18:46:31] <Ironholds>	 DarTar, yup
[18:46:33] <DarTar>	 but I thought, what about unique articles?
[18:46:38] * Ironholds thinks
[18:47:02] <Ironholds>	 that would be programmatically problematic to implement as part of the same dataset. But I could do it alongside.
[18:47:04] <DarTar>	 I think it would be fascinating to determine the breadth of traffic we get from referred vs organic traffic
[18:47:21] <Ironholds>	 I could even build it out as an inequality coefficient or something
[18:47:22] <DarTar>	 yeah, so it’s not as high a priority as simple PV counts
[18:47:35] <Ironholds>	 "here is the coefficient for referred traffic over time, here is the coefficient for organic"
[18:47:42] <Ironholds>	 I can make it into a nice little animated visualisation
[18:47:47] <DarTar>	 yeah, so I was imagining that probably some segments of traffic by referral are really focused on a small subset of articles
[18:47:49] <Ironholds>	 I've wanted an excuse to play with ggvis for a while.
[18:47:56] <DarTar>	 oh god, I take that back
[18:47:59] <DarTar>	 :p
[18:48:07] <Ironholds>	 what?
[18:48:14] * DarTar kidding
[18:48:15] <Ironholds>	 y u no liek animated gifs?
[18:48:37] <DarTar>	 I just read in the changelog for the latest Dropbox iOS app:
[18:48:51] <DarTar>	 “better support for high resolution animated GIFs”
[18:48:56] <DarTar>	 WTF
[18:49:38] <DarTar>	 anyway, what do you think about unique articles as a secondary metric to look into on a longitudinal basis? I think we would discover a lot of interesting things
[18:50:11] <Ironholds>	 sure. Do you want the specific articles, or the number, or the inequality?
[18:50:21] <Ironholds>	 like I said, I could have it save a list of coordinates for coefficients.
[18:50:54] <DarTar>	 brb
[18:59:06] <DarTar>	 oh sorry, by longitudinal I meant “historical”, not by geography
[18:59:40] <DarTar>	 (unless I misunderstand what you mean by coordinates)
[19:00:11] <DarTar>	 hey, I have to find Abbey for lunch, bbl
[19:15:03] <Ironholds>	 DarTar, nono, I know ;p
[19:15:10] <Ironholds>	 I meant coordinates in the sense of coordinate plotting
[19:15:25] <Ironholds>	 i.e., I give you a list, each element of which is a set of coordinates R can interpret as forming a Gini coefficient in plot()
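
One way to read the "list of coordinates" idea is as Lorenz-curve points, from which a Gini coefficient follows. The sketch below assumes that reading; it is written in Python rather than R, and the function names and the per-article view counts are hypothetical, not taken from the referral dataset.

```python
def lorenz_points(counts):
    """Return (x, y) Lorenz-curve coordinates for non-negative counts."""
    values = sorted(counts)
    total = sum(values)
    n = len(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one non-zero count")
    points = [(0.0, 0.0)]
    running = 0
    for i, value in enumerate(values, start=1):
        running += value
        # x: share of articles so far, y: share of pageviews they account for
        points.append((i / n, running / total))
    return points


def gini(counts):
    """Gini coefficient as 1 minus twice the area under the Lorenz curve."""
    pts = lorenz_points(counts)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # trapezoid rule
    return 1 - 2 * area


# Made-up pageviews per article for two traffic segments.
referred = [900, 50, 30, 20]
organic = [260, 250, 250, 240]
print("referred:", round(gini(referred), 3))
print("organic:", round(gini(organic), 3))
```

A coefficient near 0 means views are spread evenly across articles; a value near 1 means a few articles take most of the traffic, which is the "focused on a small subset" pattern DarTar expects for some referrers.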
[22:34:54] <DarTar>	 yuvipanda: you should really use my “fleur de yuvi” screenshot as your Trello avatar
[22:35:02] <yuvipanda>	 DarTar: :D
[22:35:05] <yuvipanda>	 DarTar: I could
[22:35:29] <DarTar>	 I can’t talk to people who show up as YP on a grey background on Trello
[22:35:32] <yuvipanda>	 DarTar: I hung out with  qchris today, and he was surprised to see me in the same outfit ;)
[22:35:33] <yuvipanda>	 hehe
[22:36:58] <DarTar>	 :D
[22:36:59] <DarTar>	 it’s the famous scottish draight of 2014
[22:36:59] <DarTar>	 draught
[22:37:00] <DarTar>	 aaarg drought
[22:37:00] <DarTar>	 seriously
[22:46:10] <yuvipanda>	 DarTar: :D
[22:46:37] <DarTar>	 Tuesday afternoon dyslexia :-/
[22:47:53] <yuvipanda>	 J-Mo: CSV and TSV download implemented, I just need to add a button now
[22:48:06] <J-Mo>	 sweet
[22:48:11] <J-Mo>	 thanks dude
[22:49:58] <yuvipanda>	 J-Mo: \o/. When's the webinar?
[22:51:21] <J-Mo>	 1500 UTC Wednesday. Same day/time as last week.
[22:54:15] <yuvipanda>	 J-Mo: cool, I should be around
[22:54:51] <J-Mo>	 double sweet. I'll be directing people to this chan if they have questions, as usual.
[22:56:08] <yuvipanda>	 J-Mo: cool
[22:56:22] <yuvipanda>	 now to add buttons
[23:18:52] <yuvipanda>	 J-Mo: so the download buttons themselves aren't done yet, will happen tomorrow
[23:18:59] <yuvipanda>	 I've most of the code done but am being dragged to sleep
[23:20:00] <yuvipanda>	 J-Mo: csv output does work tho http://quarry.wmflabs.org/run/988/output/0/csv (or http://quarry.wmflabs.org/run/988/output/0/tsv since Ironholds hates CSV)
[23:20:02] <J-Mo>	 no problem, YuviPanda. Send me a quick email if for some reason you can't finish them
[23:20:06] <yuvipanda>	 I'll just need to add buttons
[23:20:08] <yuvipanda>	 J-Mo: will do
[23:20:08] <J-Mo>	 oh! good.
[23:20:11] <yuvipanda>	 thanks!
[23:20:18] <J-Mo>	 nice to have a backup. Goodnight!
[23:20:25] <yuvipanda>	 J-Mo: you can also download JSON, btw http://quarry.wmflabs.org/run/988/output/0/json
[23:21:03] <yuvipanda>	 anyway, am off! cya
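
A rough sketch of consuming a JSON result like the one linked above using only the standard library. The payload shape used here (a "headers" list plus a "rows" list of lists) is an assumption for illustration, not a documented Quarry format, and rows_as_dicts is a hypothetical helper.

```python
import json


def rows_as_dicts(payload_text):
    """Parse a JSON result and pair each row with the column headers."""
    payload = json.loads(payload_text)
    headers = payload["headers"]
    return [dict(zip(headers, row)) for row in payload["rows"]]


# Hypothetical payload; the real output of a Quarry run may be shaped differently.
sample = '{"headers": ["page_title", "edits"], "rows": [["Zürich", 12], ["東京", 7]]}'
records = rows_as_dicts(sample)
for record in records:
    print(record["page_title"], record["edits"])
```

Unlike CSV, the JSON values keep their types (the edit counts stay integers) and non-ASCII titles need no extra handling.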
[23:25:10] <DarTar>	 Ironholds: I was reviewing https://meta.wikimedia.org/wiki/Research:Mobile_trends and I noticed that you added a note on filtering bots for edits, but there’s no mention of filtering crawlers from the traffic dataset
[23:25:34] <DarTar>	 I know you’ve done extensive filtering of bots, I think we should have this in the Data section of that report
[23:25:49] <Ironholds>	 DarTar, yes! shall do.
[23:25:56] <DarTar>	 danke
[23:27:07] <Ironholds>	 DarTar, done!
[23:27:23] <DarTar>	 sweet thanks
[23:29:43] <Ironholds>	 DarTar, also, JFYI, I don't know how around I'll be tomorrow
[23:29:58] <Ironholds>	 on account of I'm staying up all night to monitor the referer scripts and work on the apps stuff
[23:30:13] <Ironholds>	 because I live on a couch and don't exactly have anything to do until the 31st
[23:30:15] <Ironholds>	 ;p
[23:30:41] <DarTar>	 sounds good to me
[23:31:11] <DarTar>	 make sure you let da boss know
[23:35:59] <YuviPanda>	 J-Mo: yay, got a few mins back. 
[23:36:04] <YuviPanda>	 J-Mo: and download buttons in place! http://quarry.wmflabs.org/query/354
[23:36:39] <Ironholds>	 DarTar, totes
[23:36:44] <Ironholds>	 okay: noms!
[23:36:51] <Ironholds>	 I'm being taken to something called tasty burger(?)
[23:37:39] <DarTar>	 this being Boston, I wouldn’t get worried 
[23:37:48] <DarTar>	 YuviPanda: nice stuff
[23:38:30] <J-Mo>	 YuviPanda: hmm, I'm not seeing the download link?
[23:38:53] <J-Mo>	 where on the page is it 'sposed to show up? 
[23:38:54] <YuviPanda>	 J-Mo: right above the table to the right. do a hard refresh (ctrl+f5 or cmd+shift+r)
[23:39:31] <J-Mo>	 there it is!
[23:39:57] <J-Mo>	 THAT IS JUST BEAUTIFUL
[23:40:02] <YuviPanda>	 J-Mo: :D
[23:40:05] * J-Mo weeps from happiness
[23:40:19] <DarTar>	 future feature request: add the full query to a title element of each row in the Recent Queries table so you can quickly preview it on hovering
[23:40:21] <J-Mo>	 thanks for coming through in the clinch once again
[23:40:28] <YuviPanda>	 J-Mo: \o/
[23:40:41] <J-Mo>	 DarTar: https://trello.com/b/fdwhYLns/quarry
[23:40:50] <J-Mo>	 (for your feature requests)
[23:40:58] <DarTar>	 kewl
[23:42:29] <DarTar>	 done: https://trello.com/c/HyZSUsmU
[23:42:54] <YuviPanda>	 J-Mo: I'll add a feature tomorrow that always downloads the latest successful run results, and then CORS headers, so people can preview things
[23:42:58] <YuviPanda>	 err
[23:43:04] <YuviPanda>	 people can write scripts and shit
[23:43:06] <YuviPanda>	 I meant :)
[23:43:10] <YuviPanda>	 I'm off now!
[23:43:16] <DarTar>	 good night
[23:44:02] <J-Mo>	 night!
[23:55:05] <YuviPanda|zzz>	 J-Mo: oh well, looks like I've some more time :) anything else you want?
[23:55:19] <J-Mo>	 nope. I'm good! thanks, though
[23:55:21] <YuviPanda|zzz>	 J-Mo: also if you're teaching people python, I highly recommend you ask them to use JSON rather than CSV/TSV. 
[23:55:22] <J-Mo>	 GO TO SLEEP!!!
[23:55:32] <YuviPanda|zzz>	 It's only midnight!
[23:55:40] <J-Mo>	 hehe. touche.
[23:55:43] <YuviPanda|zzz>	 J-Mo: JSON also has no unicode problems and doesn't need an extra library
[23:55:48] <YuviPanda|zzz>	 oh, it's actually 1 AM
[23:56:04] <YuviPanda|zzz>	 well, my girlfriend got sucked into a wordpress loop, so I've time until she realizes it's way past 5mins
[23:56:10] <J-Mo>	 re: JSON: I'm going to teach people what JSON is, but will also have them manipulating CSVs, since a lot of people are more comfortable exploring data in spreadsheets
[23:58:48] <YuviPanda|zzz>	 J-Mo: ah, cool. don't forget unicodecsv then, since the error message it gives otherwise is super confusing - "'ascii' codec can not decode ordinal '0xfe' at position 7" or something like that
[23:59:08] <J-Mo>	 yeah, I'm intimately familiar with that cryptic message :)
[23:59:31] <YuviPanda|zzz>	 J-Mo: :D
[23:59:54] * YuviPanda|zzz channels inner halfak
[23:59:58] <YuviPanda|zzz>	 WE SHOULD ALL USE PYTHON3
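
For reference, a small sketch of where the "'ascii' codec" message comes from, assuming Python 3 syntax: it appears whenever UTF-8 bytes are decoded as ASCII, which Python 2 did implicitly in many places. The helper name try_ascii_decode and the sample string are hypothetical.

```python
def try_ascii_decode(raw_bytes):
    """Return decoded text, or the error message if the bytes are not ASCII."""
    try:
        return raw_bytes.decode("ascii")
    except UnicodeDecodeError as err:
        return "error: {}".format(err)


data = "Zürich".encode("utf-8")
print(try_ascii_decode(data))   # fails: the encoded "ü" is outside ASCII
print(data.decode("utf-8"))     # works once the right encoding is named
```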