Multiplying two complex numbers z and c is equivalent to taking z and applying a rotation and a dilation to it: a rotation through arg(c) and a dilation by |c|. Division is the inverse of both, so z/c is z rotated by -arg(c) and dilated by 1/|c|.
What you're looking at, then, is taking the operation defined by c (rotate by -arg(c), dilate by 1/|c|), treating the Gaussian integers as the vertexes of a directed graph, and asking what fraction of the vertexes are the source of an edge.
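If you want to sanity-check that picture numerically, here's a tiny Python sketch (the particular z and c below are arbitrary):

    # Multiplying by c = rotating by arg(c) and scaling by |c|; dividing undoes both.
    import cmath

    c = 1 + 1j
    z = 3 - 2j

    rotate_scale = abs(c) * cmath.exp(1j * cmath.phase(c)) * z
    print(abs(rotate_scale - z * c) < 1e-12)              # True: same as plain multiplication

    undo = (1 / abs(c)) * cmath.exp(-1j * cmath.phase(c)) * (z * c)
    print(abs(undo - z) < 1e-12)                          # True: division reverses it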
Consider the one-dimensional analogy using real numbers. Given some real number c, take all the integers as the vertexes of a graph, and whenever z/c (for some integer z) is also an integer, put a directed edge from z to z/c. When do edges exist? If c is irrational, never. If c is rational, there will be infinitely many edges, but how many? When c is 2, any stretch of source vertexes will contain on average twice as many edges as when c is 4. If we write c as p/q in lowest terms, then the smaller p is, the more edges we get, and the brighter the pixel in your image.
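A quick, hedged sketch of that claim in Python (the window size and sample values of c are arbitrary):

    # Count, over a window of integers, how many z have z/c also an integer.
    # For c = p/q in lowest terms the count comes out near N/p, which is the
    # "smaller p => more edges => brighter pixel" behaviour described above.
    from fractions import Fraction

    def edge_count(c: Fraction, N: int = 10_000) -> int:
        """Number of integers z in [1, N] whose quotient z/c is again an integer."""
        return sum(1 for z in range(1, N + 1) if (Fraction(z) / c).denominator == 1)

    print(edge_count(Fraction(2)))       # ~N/2: every even z has an edge
    print(edge_count(Fraction(4)))       # ~N/4: half as many as for c = 2
    print(edge_count(Fraction(3, 7)))    # ~N/3: only the numerator p = 3 matters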
The one-dimensional analogy will have a spike at 1/2, smaller spikes at 1/3 and 2/3, yet smaller spikes at 1/4 and 3/4, smaller ones yet at 1/5, 2/5, 3/5, and 4/5, etc. The spikes will all be distinct (because between any two rationals there is an irrational), but will be infinitely close together (because the rationals are dense in the reals). As you keep zooming in, you will keep finding more spikes like this.
What you're seeing is a variation on the classical structure of the rationals dense within the reals.
Now, a fractal is a set with a fractional Hausdorff dimension. We have to extract a set from your function of c in order to talk about fractal dimension. We could take the support of the function (everywhere it's not zero). In the one dimensional case, that's the rationals. We could take level sets farther up (the set of c such that f(c) = k, for a constant k). Those are subsets of the rationals. However, the rationals, while dense in the reals, are of measure zero in the reals, and have Hausdorff dimension zero, and so do all the level sets. So it's not a fractal.
Doesn't make it any less pretty though.
Comments with the highest number of points:
Stories with the highest number of points:
Users with the highest karma:
"Ask HN:" stories with the highest number of points:
"Show HN:" stories with the highest number of points:
* Conway's Game of Life, using floating point values instead of integers (jwz.org)
* Show HN: We open sourced Lockitron's crowdfunding app (selfstarter.us)
* 37signals Earns Millions Each Year. Its CEO's Model? His Cleaning Lady (fastcompany.com)
* Why is processing a sorted array faster than an unsorted array? (stackoverflow.com)
* I Have 50 Dollars (ihave50dollars.com)
* Why was a scam company able to raise $76 Million Series B?
* The Five Stages of Hosting (blog.pinboard.in)
* Where has all the money in the world gone? (reddit.com)
* If Software Is Eating The World, Why Don't Coders Get Any Respect?
* Hit men, click whores, and paid apologists: Welcome to the Silicon Cesspool (realdanlyons.com)
I used a threshold of 35 at first, then I upped it to 50. These days you probably want 75 or 100 if you really want to filter out the less important items.
Or you can just subscribe to daemonology's RSS: http://www.daemonology.net/hn-daily/index.rss. This summarizes the top ten every day. These are most of the important articles; of course you will miss some important things.
On the other side of the scale, you should try http://news.ycombinator.com/newest every once in a while as well. The fact is, the HN point system ends up filtering a lot of things by whatever's popular or in sync with the groupthink at the moment. Newest can help you avoid this "filter."
I wish there were a browser extension that only showed highly-ranked (or sufficiently new, maybe) comments, too. It would be useful on the comment threads I find most interesting, but it would probably be too much of a detriment on others to actually be implemented.
Wait... did I just advocate turning HN into Slashdot?
I have a suggestion: this feature gives you stories 'above' a certain threshold. I think it would help a lot if there were also an option for stories 'below' a certain threshold.
The rationale is that lots of folks here (like me) don't check the New Submissions section regularly, so lots of good stuff never gets much love. If there were an option to generate a front page for stories below a certain point count, a user could review them and upvote worthy ones, which could eventually make it onto the normal front page. Any opinions?
...it's an interesting one though. Justifies sticking with HN as your news outlet.
Another problem I have is that I keep scanning the front-page list of references to see whether a new article has popped in somewhere since the last time I checked.
This is inefficient. My impression is that a list of references sorted by the time they crossed the threshold would do the trick, though that would require a significant amount of work to produce. Precomputing the sorted list and sharing it among many users would make it cacheable.
So I'd raise the question of whether it wouldn't be preferable to offer predefined threshold values. How much difference is there between a threshold of 55 and one of 54, anyway?
Say you offer thresholds of 10, 25, 50, 75 and 100, for example; the pages could be precomputed and cached.
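Something like this hedged sketch is all it would take server-side (the item data and the refresh schedule are made up for illustration):

    # Precompute one page per fixed threshold and serve it from a shared cache,
    # instead of filtering per-user for every possible cutoff.
    THRESHOLDS = (10, 25, 50, 75, 100)

    def build_threshold_pages(items):
        """items: iterable of (title, points) pairs -> {threshold: [titles...]}."""
        pages = {t: [] for t in THRESHOLDS}
        for title, points in items:
            for t in THRESHOLDS:
                if points >= t:
                    pages[t].append(title)
        return pages

    # Rebuilt every few minutes and shared by every reader, so 54 vs. 55 stops mattering.
    cache = build_threshold_pages([("Story A", 120), ("Story B", 42), ("Story C", 9)])
    print(cache[25])   # ['Story A', 'Story B']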
A script on the browser side could keep track of the last references seen and show older articles in gray for instance.
For one thing, the Xbox is a very closed platform run by Microsoft, and there is no lack of violent games available there. I imagine this part of the agreement is simply an oversight that will be corrected in one form or another.
The only other way this could play out, if MS stubbornly refuses to allow adult content on computers, is that anybody who wants to play games will switch to another platform, be it Android or Ubuntu or whatever.
The most dangerous thing this could do to game developers would be if MS took control of launch dates for third-party games so that they didn't clash with games Microsoft planned to heavily promote itself or with preferred partners.
So maybe if you're an indie developer you can't launch a game during the Christmas period if Halo 6 is due to come out, or something like that.
If that happened, GNU/Linux would be the biggest remaining open operating system (it is already the biggest free one). That would be a good thing. If Microsoft really tries to control Windows apps so tightly that it bans popular games, it will probably kill Windows, and games will adapt to Linux. Fine with me.
But it doesn't have to come to that. The comparison to DOS is probably flawed. The new UI is, as far as I understand without being a Windows developer, just a UI (with maybe a new API). The classical desktop is not, as DOS was, an operating system Windows has to evolve away from. Though it's quite possible they might try to kill it sometime, it is not driven by the same technical cause as the move away from DOS.
Anyway, before declaring Windows dead, let's wait and see how well Windows 8 sells and how many people use the new UI and the Windows Store.
I imagine that MSFT will have some kind of enterprise program where you can run your own Windows Store Server to do enterprise deploys of Metro apps.
Either way, it looks like Linux might end up the one place where you can install and run your own software over the long haul.
I keep reading that Windows 8 distribution is limited to the MS store for apps with the Metro UI, but that doesn't seem to be the case from my limited experience.
Can anyone clarify?
Everyone is gushing about the post-PC era, Eric Schmidt announced a few days ago that Microsoft is irrelevant, and yet the article takes for granted that Microsoft and the PC as we know it will survive for the next 20 years and that Windows will be the dominant platform on the PC?
If history teaches us anything, it's that there's always some solution to the problem. If the Windows 8 marketplace turns out to be too restrictive, game developers will turn to Steam on Linux. And with enough gamers on Linux, Asus or Gigabyte will not be pressed to make only "compatible with Windows 8" UEFI-locked motherboards.
Or the consumers (and the gamers) will find Win8 marketplace acceptable. Or maybe in five or ten years some other player will sweep the market.
Looking at the current technology and lamenting the end of the world is just plain stupid.
Between this and UEFI secure boot these guys are really messing up the PC ecosystem.
(happens to be much more readable there, too)
But the big take-away innovation here, and the center of their marketing campaign, is that it has a keyboard cover.
This seems to be about 5 years out of sync. 5 years ago, everyone expected the iPhone to fail because it didn't have a physical keyboard like the blackberry.
But after I got my first iPad, the original, I found that I could type at nearly the same speed (possibly faster, due to autocorrect) on its on-screen keyboard as I can at a regular keyboard... my fingers just go to the place the key is, and while feeling a physical key would be nice, the end result on the iPad was about the same speed.
I think this will sell well into markets that are heavily invested in Microsoft infrastructure... but I don't see how it is going to take market share from the iPad.
"tablet market"? nope - already dominated by iPad
"enterprise"? nope - 1) not a market, 2) already dominated by iPad
"budget"? nope - priced on par with iPad
seriously, how are they going to push this thing? they don't have a channel like Apple or Amazon, so they need to rely on all their other paths.
they should've called this "XPad", made it a mobile Xbox, and sold through that channel - just like they did with Kinect.
fail fail fail.
Companies I've worked for: Virgin Games (run by a British immigrant) and Shiny Entertainment (also run by a British immigrant, and seemingly > 50% foreigners). Naughty Dog had at least 5 or 6 foreigners out of 30 employees when I was there. The owner even made a point of showing us the first hire's salary and asking us if we knew any qualified locals who wanted the job, so that he could be sure he was on the right side of the law.
A few years later that hire went on to co-found Ready At Dawn which was founded by 3 immigrants. I have no idea how many of their employees are foreign.
I have seen one company, Interplay, abuse its power over a foreign employee by threatening to pull its visa support for him if he didn't do X (I think X was stopping some non-work-related outside music activity). He ended up marrying his local girlfriend and then got the hell out of that company.
The company I currently work for, as far as I can tell, has a similar position. We'll hire anyone who applies and can convince us they can do the job. Finding them is hard. There may be tons of qualified people, but either they aren't applying, they can't write a resume that makes them look qualified, or they can't convince us in the interview that they are qualified.
I worked for IBM for almost 12 months before I got out of undergrad. There were 4 hiring managers and a senior VP in a hardware unit personally vouching for me. Yet IBM didn't hire me, because they were ridiculously careful with H1-B hires after a snafu in the early 80s when the immigration department cracked down on them. IBM's engineering divisions were hiring H1B folks only if they had 2 years of experience with a BS, or a Masters. I was extremely pissed at the time because I felt I had more credibility, based on merit and my time at IBM, than many other interns who were getting offers left and right after spending most of their time playing Counter-Strike in the labs. They were being ridiculously paranoid about going by the book on this one. And sure enough, after I got my Master's I did get, and continue to get, offers from IBM.
Like some folks said, this probably has to do with third parties who place folks at IBM. Keep in mind, I'm not trying to support them (I'm still sour about my undergrad days); I'm just refuting the whole point about IBM hiring H1B folks just to save money. Just ask your H1B friends - I'm sure most of you have many. The only ones I've heard of doing shady things with H1B candidates are small consulting shops and third-party staffers.
If your job is to deliver products, you'll find you get what you pay for. I work with some guys who maybe make 10k less than they could because of their immigration status, but it's not a difference between 150k and 60k or something.
I feel that the only way to stop this is to take away the economic incentive to hire foreign workers. If there were a tax requiring the difference between the market rate for the position and the salary actually paid to go to state and federal governments, we would solve this problem overnight.
I also think part of this problem is the general business attitude that everyone is a replaceable cog in a machine. Don't get me wrong, I have met and worked with H1-B holders who were superstars, but many of them are not perfect candidates and end up costing more in productivity and efficiency than a grade-A local engineer. A ninja developer can be 20x more productive than someone who is not, if you believe the hype, so I think companies should focus on getting the right people rather than just thinking about the people who are cheap. While the ninja might still be an H1-B, we should have a level playing field where the best guy wins. It's better for the company overall, but unfortunately many people are too shortsighted to see that.
http://www.wto.org/english/thewto_e/whatis_e/tif_e/fact2_e.h...

"2. National treatment: Treating foreigners and locals equally. Imported and locally-produced goods should be treated equally - at least after the foreign goods have entered the market. The same should apply to foreign and domestic services, and to foreign and local trademarks, copyrights and patents. This principle of 'national treatment' (giving others the same treatment as one's own nationals) is also found in all the three main WTO agreements (Article 3 of GATT, Article 17 of GATS and Article 3 of TRIPS), although once again the principle is handled slightly differently in each of these.
National treatment only applies once a product, service or item of intellectual property has entered the market. Therefore, charging customs duty on an import is not a violation of national treatment even if locally-produced products are not charged an equivalent tax.
P.S.: I am from India, and I have no intention now (nor have I ever had any) of emigrating to find better opportunities. So this comment is not borne out of any bitterness; it is out of genuine curiosity as to why this point of view is rarely mentioned in any such debate.
Chemical engineering, to use that example, designs elaborate and complex dynamic systems by chaining together abstract chemical algorithms. Each one of those little algorithms is subject to both patent and copyright. As with software, most of the commonly used algorithms and clever hacks were either never patented or the patents have long expired. It is only on the bleeding edge that some chemical algorithms are under patent; as with computer algorithms, there are an unbounded number of potential algorithms, but some are more efficient than others. Specific implementations are still covered by copyright and are widely licensed (as libraries).
Most of the nominal specialness attributed to software as a domain for intellectual property does not really exist. Yet the rarely questioned assertion that computer software is special in some way has created a dearth of comparative studies that would likely be valuable from both a theoretical and a practical policy standpoint. Either these other areas, like chemical processes, are equally broken at a fundamental level and the scope of reform should be extended beyond software, or there are differences in implementation across otherwise equivalent domains and we should be borrowing from the better implementation. It seems like an oversight that no one is attempting to do either.
I was looking at a face recognition patent that was filed in the US by a Japanese company in 1998 and finally issued in 2006. Eight years is a very long time in the fast-moving software industry -- even if you get lucky and have your patent granted in two years, it's quite possible that your invention is obsolete by the time you get it.
Given that software is so fast paced, most organizations that expect to be "practicing entities" find the patent application to be a distraction from the task of getting a competitive product in the marketplace. This is a very different situation from other fields where you really can get a patent for a mechanical or electronic thing and then have the patent as a tool for negotiation w/ manufacturers.
And, it's nice work.
<edit>Doh. I stand corrected. Vader did fly a TIE Fighter - technically the "TIE Advanced x1".
Looks like I'm overdue to rewatch SW:ANH. smile</edit>
What isn't mentioned is why SQLite is using a virtual machine in the first place. The reason is that SQLite only calculates the next row of results when you ask - it does not calculate all result rows at once in advance. Each time you ask for the next row of results it has to resume from where it last left off and calculate that next row. The virtual machine is a way of saving state between those calls for the next row (amongst other things).
This is also why there isn't a method in SQLite DB adapters to get the number of result rows. SQLite has no idea, short of actually calculating them all, which is the same amount of effort as fetching them all.
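You can see both behaviours from Python's built-in sqlite3 module; this is a minimal sketch, not the C API itself:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

    cur = conn.execute("SELECT x FROM t WHERE x % 2 = 0")
    print(cur.rowcount)      # -1: the number of result rows isn't known up front
    for row in cur:          # each iteration resumes the query and computes the next row
        print(row[0])        # 0, 2, 4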
1- The main cost of third-party software is never the cost of the code; it's the cost of using, integrating, customizing, and getting support for it. The utility of the raw code itself is often zero. This is why binpress - selling code - never (really) took off, but github - a code community - did.
2- Given no restrictions, the prices people slap onto source code get very ridiculous, very fast. Non-technical people expect well-polished software for $1.99 (see: App Store). Hobbyists and developers often have a case of NIH, and a lot of them think that code should be communal and free (as in beer). The reasons are varied but the end result is that source code (by itself) is not considered a valuable commodity by the market anymore, which means nobody cares about selling theirs - or they try and quickly learn it's not worth it.
3- Licensing. You have a minefield of legal issues of ownership to resolve. If you haven't looked into it, you probably don't even realize the extent of the BS that will be thrown at you.
4- I won't sugar coat this. You'll never make any money on a 3% commission of a commodity that's already priced dangerously close to zero by the market (see #2). The costs of dealing with people whining when things go wrong - alone - will exceed your commission.
5- I'm a developer, and I just don't see the value-add here. I have to do my own marketing, I have to do my own sales, I have to write the software, and I have to support it. If I'm going to go through that trouble, why don't I just blast up my own template Stripe page w/download link and cut out the middleman?
I guess what I'm saying is: please don't make my mistakes. Do something different and make different mistakes.
Also, I'm from Waterloo, so I understand what it's like to be a tech entrepreneur in Canada. And sadly this means I should underline point #3, which is much worse in Canada than in the States.
I'm uncertain about the business model - is $0.15 per transaction really going to add up? - but according to Steve Blank, that is the point of a startup anyway. (http://steveblank.com/2012/03/05/search-versus-execute/)
Very nice work!
I primarily use http://hckrnews.com to browse HN submissions ( <-- this site is great. If you don't use it, you're missing a lot IMO ) and its extensions for Safari: http://hckrnews.com/about.html
Both have great ideas, and I think pg should adopt one of them (though I know he won't). I hate it when I use my iPad for browsing HN. Comments are small, up/down-vote arrows are minuscule, and you can't use these highly useful plugins. And I've tried about a dozen different clients so far. None of them offer anything like http://hckrnews.com (a chronological timeline of submissions), so I keep coming back to Safari... :(
Or, even better, built into the site!
Does this fix the bug (which I assume was caused by this) where if a story was marked as read then it sometimes loaded the comments page without the main story link?
Also, the "Follow Comments" functionality wasn't obvious to me until I read this blog post - perhaps a rewording?
Thanks for a great extension!
How many people click through to your consulting page?
How many additional inquiries do you get over the coming 1-2 weeks?
Can I ask, do you think it has advantages over, say, recording your screen while you record yourself taking someone through a demo? Then maybe replacing your voice with a voicebunny voiceover?
I imagine cartoon vs. screen demo depends on your audience and objectives, but I'm intrigued as to why you went this route. Cheers!
Some background at http://digital.cabinetoffice.gov.uk/2012/10/16/directgov-a-q...
Gov.uk has been built by a (relatively) small in-house team, by people who genuinely care about what they are building. They embrace the fact that they are building tools for the good of society rather than just satisfying a contract.
Also as a citizen, I love the fact that I can open a pull request on my government's website ( https://github.com/alphagov/calendars/pull/1 ). We've got the ball rolling in opening up government data on the internet, but this is a great example of how technology can enable citizens to get involved in government.
Look at the most active searches on the site. Top of the list is the JobCentre Plus job search. All gov.uk does is link you to the existing JobCentre Plus website on Directgov.
Similarly, let's say I want to book a driving test (another of the most active searches). All gov.uk does is link me to the existing booking service on Directgov.
Okay, so I want to renew my tax disc (again, another of the most active searches). Again, all gov.uk does is link me to the existing DVLA interface on Directgov.
This goes on and on. The only useful thing gov.uk can do is link me to the existing Directgov sites. Google already does that for me.
I assume the long-term plan is to integrate all these services into the gov.uk site, but one can't help thinking they should have done this - and actually made the site a useful port of call in and of itself - before shedding the beta status.
Edit: Just noticed the site doesn't properly launch until tomorrow, so I wonder if that will change all of this. If so, you can colour me impressed.
Will direct.gov.uk have a bunch of redirects to gov.uk?
Money well spent!*
* Sarcasm for reference...
Also, you should check out my free app: https://zetabee.com/cashflow/ (see the demo and update the dates/amounts to play around). It has similar features and is very much like the original spreadsheet I used myself. It has burn rate (balance), monthly (income vs. expense), and many other views/lists.
My biggest feature request has been multiple simulations/forecasts so users can do what-if scenarios, along with the ability to easily ignore/disable rules with one click to help with this: either I get the new car or I don't; check/uncheck to see how it impacts the future. I've been very busy with my other projects and have not updated this app in over two years. I hope you guys can incorporate these features into your app so I can forward the heavy users to you. Good luck!
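For what it's worth, the what-if toggle is simple to model; here's a hedged Python sketch (the Rule class, amounts, and horizon are all made up for illustration):

    from dataclasses import dataclass

    @dataclass
    class Rule:
        name: str
        monthly_amount: float    # positive = income, negative = expense
        enabled: bool = True     # the "check/uncheck" box

    def project_balance(start: float, rules, months: int):
        """Month-by-month balance using only the enabled rules."""
        delta = sum(r.monthly_amount for r in rules if r.enabled)
        balance, out = start, []
        for _ in range(months):
            balance += delta
            out.append(balance)
        return out

    rules = [Rule("salary", 5000), Rule("rent", -1500), Rule("new car", -600)]
    print(project_balance(10_000, rules, 6)[-1])   # ending balance with the car
    rules[2].enabled = False                       # one click: un-check "new car"
    print(project_balance(10_000, rules, 6)[-1])   # ending balance without it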
My suggestion is to take your answers to those questions and use them on the front page to provide a bit more information to hook potential customers like myself.
The homepage is not yet ready for the general public, nor for a "Show HN"!
I'll happily answer questions though.
Here's an internal newsletter archive if you want to have a quick look inside though:
We've been managing this using a spreadsheet but tracking loan repayments, stock purchases, postage, cost of sale and salaries as well as picking a useful scale (daily vs weekly) has been a real challenge.
I can't wait for an invite.
Looking forward to receiving an invite to the beta :)
I used to do this kind of thing with a spreadsheet, but this tool automates it and gives you nice charts.
1. Her work eventually killed her. There was no way for the Curies to know what effect long-term radiation exposure would have, but it's still not a huge image booster for her to have died from material handling practices that would be considered idiotic today.
2. She wasn't a "lone wolf": she did much of her work with her husband Pierre, who shared the Nobel Prize in Physics with her. Pierre was an instructor when they met and undoubtedly gave her a huge helping hand right when she needed it. Never mind that she came from a poor background and showed remarkable determination just getting into college in the first place, or that she would later be the sole recipient of a Nobel Prize in Chemistry!
3. People are still scared of words like "radioactive" and "radiation". Just look at how comfortable people are with coal power that kills thousands every year as a part of normal operation. Then note how those same people freak out when a nuclear plant threatens to give a handful of people cancer, but only after being horribly mismanaged and then hit by an improbable sequence of natural disasters! Arguably, Curie is scary by her association with something people are unreasonably paranoid about.
4. Let's face it, Ada Lovelace was a bit of a looker, or at least she was painted that way. She even had a sexy-sounding name. We have real, unromantic photographs of Curie, on the other hand, that reveal her to be rather plain by comparison, plus her name is now linked with a scary unit of radioactive decay! It's a case of the belle vs. the schoolmarm.
She's a hero here in France but I don't know about her reputation in the English speaking world.
She was an amazing person. Born in Poland, she studied secretly for years in her native country, because higher education was not open to women! Then she moved to Paris to join her sister and continued her education during the day while tutoring in the evenings to pay for it.
- - -
There was a nice play (in French) that I saw in 1989, called "Les Palmes de monsieur Schutz"
A movie was later made from the play. I didn't see the movie, but I hear it's not too bad for that kind of stage-to-screen adaptation (and it has cameos from Pierre-Gilles de Gennes and Georges Charpak).
The need to celebrate the one symbolic "woman of science" is sexist and crude. The idea of one "woman of science" makes the work of other female scientists seem less important. The idea that the "woman of science" be symbolic makes Marie Curie's work seem symbolic and unimportant.
Why can we not just celebrate both Marie Curie and Ada Lovelace as excellent scientists, leaving gender out of the discussion?
... since we recently had a story about recognizing contributions of polish people: http://www.smh.com.au/world/honour-for-overlooked-poles-who-...
I think it's perfectly legitimate to celebrate Marie Curie but at the same time also celebrate Ada Lovelace, and I don't see how the two are in competition.
If we're going down that route, why are we paying so much attention to Alan Turing? Who elected him as the man in science? Aren't there other computer pioneers worth celebrating? Well, yes, but talking about Turing doesn't lessen them.
And, for that matter, why do women have to be their own category at all? Why does "Ada Lovelace Day" automatically mean "Woman Day"? Does that mean we should also elect one single man to name a day after?
http://www.orau.org/ptp/collection/quackcures/quackcures.htm (see the bottom of the page)
Most of them seem to originate in Japan.
I keep referring back to this topic more and more these days... http://en.wikipedia.org/wiki/Stiglers_law_of_eponymy
The summary of Lovelace in the OP brought this back to mind, as it is exactly what my stance had been before I ditched it for the more conservative "well, I guess I don't really know." I never looked into it again until now. Does anyone have a recommendation for a balanced modern summary of her contribution?
I believe most of what I had read by Swade was written before she was as popular in the CS community as she is now, and when I raised his points in discussion I was usually told he was widely regarded as holding a grudge against her for some reason no one really understood.
Another thing about Rwanda that is surely affecting their growth, and that blows many African countries out of the water, is that they have wired most of the country with fiber while most countries have dial-up speeds. It is not yet common residentially, but for global businesses establishing a base in East Africa, this is huge. I lived two hours outside the capital in a mud house without running water, but I did have a 1 Mbps download (I was near a rural, well-financed hospital, but still, it was faster internet than I had in Boston). This is largely due to the government there, which is a pretty well-oiled machine with a bit of a benevolent dictator, but one who gets things done for the benefit of the country, IMHO. Contrast that with where I am currently living in West Africa, in Sierra Leone, where the infrastructure is dismal and there is no kLab-type place anywhere. There is a lot less action in the startup/entrepreneurship scene. Bad infrastructure, a long civil war, and countless other things feed into this.
In terms of startups...so much of Africa runs on mobile phones (the majority of small amounts of money is transferred via SMS) and most of what I saw in terms of startups was based around Mobile-Social-Local. Not unlike what you see in the US and elsewhere.
So a scene of technology companies is pretty far fetched.
Don't let the growth rates fool you. Most of it is still related to oil, mining, construction and public works.
Once a startup gets enough success to make relocation accessible, why wouldn't they relocate somewhere easier?
You can't just copy 760 company files, which I assume are not public, into your private Dropbox account. It may have been a mistake; it might not. Either way, some sort of legal action seems inevitable.
These petty fights are getting tiresome.
The software engineering career is in somewhat of a mess right now. It comes down to the "bozo bit" problem. Being a software engineer (even with 10+ years of experience, because there are a lot of engineers who only do low-end work and don't learn much) is not enough to clear the bozo bit, and you won't be able to prove that you're good unless you have a major success, and it's hard to have that kind of success without people already trusting you with the autonomy to do something genuinely excellent.
It's not enough to write code, because LoC is a cost and source code is rarely actually read at the large scale. At least for backend developers, the only work-related (as opposed to political) way to establish that you're worth anything as a engineer is to have an architectural success, but it's very hard to have architectural successes unless you've established yourself as an "architect" to begin with. So there's a permission paradox: you can't do it until you've proven you can, and you can't prove you can do it until you've done it. Hence, the vicious politics that characterize software "architecture" in most companies.
Functional programming is one way to put yourself head and shoulders above the FactoryFactory hoi polloi. The problem is that most business people don't understand it. They just think Haskell's a weird language "that no one uses". Elite programmers get that you're elite if you know these languages, but most companies are run by people of mediocre engineering ability (and that's often just fine, from a business standpoint).
It is true that functional programming is superior to FactoryFactory business bullshit, but it's not well-enough known. Good luck making that case to someone who's been managing Java projects for 10 years. What is better known is that mathematics is hard; it's a barrier to entry. I doubt more than 5% of professional programmers could derive linear regression.
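For the curious, the derivation being alluded to ends in the normal equations; here's a minimal numpy sketch on made-up data (not anyone's production code):

    # Minimizing ||y - Xb||^2 gives X'Xb = X'y, i.e. b = (X'X)^-1 X'y.
    import numpy as np

    rng = np.random.default_rng(0)
    X = np.column_stack([np.ones(100), rng.normal(size=100)])   # intercept + one feature
    y = X @ np.array([2.0, 3.0]) + rng.normal(scale=0.1, size=100)

    beta = np.linalg.solve(X.T @ X, X.T @ y)   # closed-form least-squares fit
    print(beta)                                # roughly [2, 3]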
So I see the data science path (and yes, it takes a long time to learn all the components, including statistics, machine learning, and distributed systems) as a mechanism through which a genuinely competent software engineer can say, "I'm good at math, and I can code; therefore, I deserve the most interesting work your company can afford to fund." It's a way to keep getting the best work and avoid falling in with the FactoryFactory hoi polloi who stop advancing at 25 and are unemployed by 40.
My experience: I had basically zero math background, but I took ML with Ng and probabilistic graphical models with Koller, and later was a TA for Ng's ML class during my Master's degree, and thought I was all set to go into machine learning jobs. To my surprise, I consistently found myself stumped in interviews by questions from basic stats, particularly significance testing, which people with more traditional stats backgrounds assume is basic knowledge (and it should be) but which wasn't taught in any of my ML classes.
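For reference, the kind of "basic stats" question being described is roughly this; a hedged sketch with made-up samples, using scipy:

    # Two-sample t-test: is the difference between the group means significant?
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    a = rng.normal(loc=0.0, scale=1.0, size=200)
    b = rng.normal(loc=0.2, scale=1.0, size=200)

    t_stat, p_value = stats.ttest_ind(a, b)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")   # reject the null if p < 0.05, say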
I'm in a job now that involves some machine learning, but the ML component is 50% marshalling data (formatting, cleaning, moving), 40% trying to figure out how to get enough validated training examples, and 10% thinking about the right classifier to use (which someone else already implemented). Which to be honest is not very interesting.
So yeah, becoming a real data scientist is hard, requires a lot more knowledge than you get in one ML course, even from Andrew Ng, and the reality of the work often doesn't make it some dream career. And the competition for jobs isn't from other people who also just took that course -- it's from PhD statisticians and statistical physicists who might have taken one ML class to show them how to use all the mathematical tools they already have to do the new hot thing called machine learning.
Also, someone who has to constantly shift their attention between statistics and database servers might get less done than somebody who can concentrate on the mathematics and let their co-workers handle the implementation details.
It shouldn't be surprising or bad news that some "data scientists" have deeper knowledge than others. We're going through a quantitative revolution -- many fields and industries are nearly untouched by statistical analysis/machine learning, and so there's a lot of low-hanging fruit in going from "nothing" to "something." Even somebody who only knows a little can add value at these margins. But, of course, that won't be true forever -- look at quantitative finance, which is very competitive and requires a lot of education, because the low-hanging fruit was picked in the 90's.
There's room in this world for the statistician, the mathematician, the database engineer, the AI guy, the data visualization expert, the codemonkey who knows a few ML methods, etc.
This of course devalues your own skills, as you are one of the elite few. Unless you start writing textbooks. Which you should do, if you're one of the few. And self-promote like hell. If that course doesn't cover it, what does? Do you acknowledge that in a few years some shortcuts might be possible - that budding data scientists might not need to have read every book that you have? I bet if you try, you can make your own shortcut in the form of a book.
Which is followed by a link to my own book on this topic, Agile Data: http://shop.oreilly.com/product/0636920025054.do which attempts to demystify as much as teach.
This probably needs to be clarified a bit to say that Ng's course skipped this. Daphne Koller's "Probabilistic Graphical Models" (running now at Coursera) covers this in great detail.
Minor tweak in an otherwise nice post.
I actually found this list encouraging because the things I don't know well on that list are things I'm working on and am aware are holes in my knowledge.
But in the end the reality will always be that the people who are "real" data scientists will be the people who are actually solving real problems, whether or not they can check off every bullet point on a checklist.
I'm excited about what the future brings. Many industries have seen the value in data science, and universities are following (see, e.g., Columbia's new data science institute).
No one ever claimed that taking one class made someone an expert "data scientist." Instead, that single class whetted the appetites of Luis, Jure, and Xavier (the three competition winners) and pushed them to learn more about machine learning and natural language processing. They then went on to dive much deeper and excelled specifically in one area of applied NLP.
However, without that first class, there's a good chance none of them would have ever focused on (or heard of) machine learning. Their story is growing increasingly common. Like the Netflix Prize, Andrew Ng's first Coursera class did its part in shining a spotlight onto our dark little corner of the universe.
I'd be very cautious about a long checklist of items that are necessary to be a successful data scientist (which is a pretty ill-defined and encompassing term at this point). That is a decent summary of many useful tools of the trade, but they are by no means useful for all problem domains. For example, I could spend years working on machine learning for EEG brain-computer interfaces without a good reason to use databases or "big data" NoSQL technologies. I especially enjoyed MSR's take on the matter in "Nobody ever got fired for using Hadoop on a cluster" http://research.microsoft.com/pubs/163083/hotcbp12%20final.p...
When we're hiring data scientists or seeking successful ones, we've found focusing on demonstrated excellence in one relevant area plus general quantitative competencies and the curiosity and tenacity to learn new tools and techniques works far better than a laundry list of skills and experiences.
where's the science? it is, after all, a data scientist role. where is learning to do actual science?
what the world's been describing is an analyst or an engineering position, not science. if you don't know how to ask questions, interpret results, structure experiments - then you don't know science, so quit calling yourself a scientist. science involves a rigor of thinking and doing that has been omitted here.
Sound familiar? The entire team I worked with had a similar workflow, but we went by life science domain specific titles rather than "data scientist." I'm willing to bet that other professions have similar roles, merely called something else.
I think "data scientists" are out there in the sciences. They just don't go by the latest buzzword.
Also, it's not clear folks are using the term consistently vis-a-vis scale/scope. Consider an analogue from knowledge and expertise (a real estate example):
Is a data scientist an architect? Or the person who builds the building? Is he the guy who does the plumbing? Although the up-and-coming "quantitative system analyst" probably doesn't quite ring the same tune on a business card. And most refer to lower-level quant mastery, e.g. social engineering or quantitative finance, as a "black art", not a science. Without a high-level vision, the concept/title seems... grandiose, until you get to very extreme levels of skill. And then it makes sense.
I've been testing Tableau, but it's only for Windows and I'm on a Mac.
I'm looking for something that easily connects to your SQL database and lets you produce all kinds of fancy graphs, with an easy user interface. Tableau comes close to what I'm looking for, but my gut feeling is that there should be a whole bunch of good commercial and free software in this field out there; my googling hasn't given any good results yet, though.
So if anyone has any good suggestions, please let me know.
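In the meantime, the usual fallback is pandas + matplotlib; here's a hedged sketch of that workflow, not a Tableau replacement (the connection string, table, and column names below are placeholders, not a real setup):

    import pandas as pd
    import matplotlib.pyplot as plt
    from sqlalchemy import create_engine

    engine = create_engine("postgresql://user:password@localhost/mydb")  # placeholder DSN
    df = pd.read_sql("SELECT day, SUM(amount) AS total FROM sales GROUP BY day ORDER BY day",
                     engine)

    df.plot(x="day", y="total", kind="line", title="Sales per day")      # quick line chart
    plt.tight_layout()
    plt.show()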
For any field, one has to provide positive encouragement (and a good platform/set of tools and techniques) to people seeking to get into that field, while being grounded in reality.
Working out the question is what makes it a hard (and creative) process; then you can apply your ML toolbox.
edit: what's different about a data scientist vs. an analyst/statistician is that they build their own tools, as the datasets are too massive and non-standard for the usual toolset.
However, what really worries me is that you use SendGrid, but your confirmation email still got to my spam folder.
Does this mean that the reputation of sendgrid.me (which is what was used in this case) is down?
I strongly suggest that you get your own email IP; you will have low bounce/unsubscribe rates anyway, since you send only transactional emails.
(I'm not affiliated with SendGrid, and have no relatives there.)
After reading through the site, though, I am not sure the tone/feel of your site suits your purpose. Looking for a job is a very serious thing; I wonder if your site is a little too comical for your audience/purpose.
For me, it doesn't have the feel of a site I would want to use for job hunting.
Other than that I really like the low traction.
One thing to consider: I spent about a minute looking on your site (and on Laudits) for information about you and didn't find anything. So you seem kind of anonymous. You could be a bunch of sniggering recruiters, for all I know.
For instance, putting people on multiple projects versus single projects: multitasking and context switching cause known losses in efficiency even under the perfect scenario with no startup time. So if you want to emphasize getting projects completed, multitasking must go. This is an operations-management mindset, though it runs against much current practice.
The points about efficiency are similarly misguided. You don't focus on worker efficiency; operations points you at throughput and global optimization rather than local optimization.
Those, however, are about focusing on improvements to and operation of current systems. It is true that innovation is different from operations, in the same way that learning to program efficiently in a language is different from inventing a programming language. They are different kinds of things. Many operations principles, however, can be successfully applied to innovation. For an excellent read on the topic, see "The Principles of Product Development Flow" by Reinertsen.
To get a better understanding of operations in general, rather than the hyperbole advanced in the article, a good starting place is "The Goal" by Goldratt.
One big revelation I had while studying operations is that the things that largely drive developers crazy aren't good management practices. They aren't advanced by operations researchers. They are, in fact, what I always thought they were: bad management practices.
My goal is stability. The dev group's goal is growth. In a very small startup, it's hard to hold both these ideas in your head at the same time. If you don't grow enough, you will never need stability. If you grow enough, you will eventually need stability so that your customers do not desert you. Thesis, antithesis, synthesis: supportable growth.
My ops group meets with devs for discussions of supportability and performance well before the actual handoff. We make sure that monitoring is allowed for from the beginning, and that there are processes and documentation that let us fix problems at 3 AM or diagnose them well enough to call the right person to fix them. Performance is checked regularly. Is it time for new hardware? Is it time to rethink the architecture to support faster performance or higher scale?
The two forces will be vector-summed. It's important to get it all pointed in a direction and magnitude that leads to success.
The thing that struck me immediately was the focus on tools. Process is not about tools. Tools do not solve problems (believe me, as you grow they'll cause many problems).
Figure out* your process; only then look for or build the tools you need. It's best to start out with the simplest thing that could possibly work, which for many is a whiteboard or some similar low-tech solution.
* You (should) never stop evaluating and improving your process. One of the best processes to adopt is "iteration", with bits of evaluation and planning in between.