D.Analysis
-
Slacker007 wrote:
I don't see C# anywhere.
I'm rather disappointed FORTRAN isn't on that list.
Jeremy Falcon
Go, Forth, and add it.
-
I'm not a fan of Python, but when it comes to big data it's extremely popular. So, you'll find a lot of tools, online docs, etc. to work with.
Jeremy Falcon
Jeremy Falcon wrote:
extremely popular
Popularity does not imply suitability. Python itself can't do very much and any heavy lifting has to be done in a more powerful language.
-
Depends on the "data" and the objective. I would argue that Excel and MS Access are adequate for a lot of situations. "Analysis" could mean simply coming up with some totals (i.e., SQL).
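To illustrate the "totals" case, here is a minimal sketch using Python's built-in sqlite3 module. The `sales` table, its columns, and the sample rows are invented for illustration; the point is only that a single GROUP BY often covers this kind of analysis.

```python
import sqlite3

# Hypothetical sample data: (region, amount) rows invented for illustration.
ROWS = [("north", 120.0), ("south", 80.0), ("north", 30.5), ("east", 45.0)]


def totals_by_region(rows):
    """Return {region: total amount}, computed by a SQL GROUP BY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales "
            "GROUP BY region ORDER BY region"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    for region, total in totals_by_region(ROWS).items():
        print(region, total)
```

The same query would work largely unchanged against most SQL databases; only the connection code is specific to SQLite.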
"Before entering on an understanding, I have meditated for a long time, and have foreseen what might happen. It is not genius which reveals to me suddenly, secretly, what I have to say or to do in a circumstance unexpected by other people; it is reflection, it is meditation." - Napoleon I
That is definitely a good point. Because I've been doing a lot of ETL work over the past ten years, I usually think in terms of analyzing a new data source to determine what datatypes and data quality checking will be required, not analyzing data to find trends and such. I did have to do some of the latter to determine trends in log files -- record count growth and such. But never on the actual incoming data itself.
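As a rough sketch of that kind of source profiling, the following counts which value types show up in each column of a CSV file, which hints at the datatypes and quality checks a load would need. The function names, the four type labels, and the sample data are all assumptions made for illustration, not anything from an actual ETL tool.

```python
import csv
import io
from collections import Counter


def infer_type(value):
    """Classify one raw string as 'empty', 'int', 'float', or 'text'."""
    text = value.strip()
    if text == "":
        return "empty"
    try:
        int(text)
        return "int"
    except ValueError:
        pass
    try:
        float(text)
        return "float"
    except ValueError:
        return "text"


def profile_csv(handle):
    """Count the inferred types seen in each column of a CSV with a header."""
    reader = csv.DictReader(handle)
    profile = {name: Counter() for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            # Short rows yield None for missing fields; treat them as empty.
            profile[name][infer_type(row[name] or "")] += 1
    return profile


# Hypothetical sample: a non-numeric id and a missing amount are planted.
SAMPLE = "id,amount,note\n1,10.5,ok\n2,,late\nx,3,\n"

if __name__ == "__main__":
    for column, counts in profile_csv(io.StringIO(SAMPLE)).items():
        print(column, dict(counts))
```

A column that is mostly "int" with a few "text" or "empty" values is exactly the sort of thing that tells you which quality checks to write before loading.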
-
PIEBALDconsult wrote:
Popularity does not imply suitability.
Yes and no. You gotta look at it from the n00b's standpoint. Popularity does imply there are more libraries available for it that would be useful or suitable. And it implies it would be easier to learn, with more resources available. Even if, say, the language took like 2 more lines per concept to code or whatever. There's usually more than one factor to consider.
Jeremy Falcon
-
Not when one has an employer who won't allow downloading packages of any kind. Everything has to be vetted by teams which have no clue before it can be used. By which time it's obsolete of course.
-
That could significantly reduce the viability of some popular languages that thrive on the gazillions of freely available libraries.
trønderen wrote:
some popular languages
...which shall remain nameless, of course!
The difficult we do right away... ...the impossible takes slightly longer.
-
more likely, embed thyself
-
what are the programming languages best linked with data analysis?
So, what is D.Analysis? How do we answer this question with zero parameters? For starters, what is the data being stored in? Personally, I think the DB is the first thing to consider. I've done some pretty heavy lifting using PostgreSQL and a few lines of C#. I'm out of touch with SQL Server, but I'd guess it has the same or more functionality. I'd need to know the size of the data in both width and record count, as well as the ultimate goal of the analysis. I can't pre-spec a language from the question as asked. If I had to, I'd go with the language you are most proficient in, as learning a new language is probably not practical in the real world.
-
what are the programming languages best linked with data analysis?
"Best" is quite subjective. Even applying the scientific method to get an answer is likely to yield multiple top results with differences too small to matter. When asked in the context of the whole software development life cycle (SDLC), there are other considerations in determining what is best for you, your team, and your project. What language that you know, or can easily learn, has an efficient, relatively simple, repeatable, and programmatically configurable library for data access? Of the answers to that question, which libraries offer (in the context of the SDLC and your project) the best blend of simplicity, scalability, performance, and supportability?
For me, since I use .NET and C#, I use the SqlClient library for whatever DB I am working with, wrapped in a simple, straightforward data access layer with transactional support and parameters to avoid SQL injection. I do not use Entity Framework for production apps, since it tends to have higher support costs as a project evolves after the first production release, can generate some awful SQL, and does not scale well, besides being slower for "real world" CRUD uses.
Other developers will have other preferences, based on what they know, what kind of projects they work on, and their level of experience in a broad range of projects with the tools they select. The right answer is what works best for the developers on the project to deliver a production app that works, that scales, that is reliable, and that has the lowest SDLC cost for updates and extensions as the app matures and changes over time.
I know that is not a simple answer, but our discipline is not a simple one, nor one in which we can be successful by using "cookie cutter" designs or following "recipes" as if we were simply assembling widgets. To get to your question, having obtained the data, you can apply the same principles to what you use to analyze the data.
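For anyone unfamiliar with the idea, a thin data access layer with transactions and parameters might look roughly like the sketch below. Note that this is Python with the built-in sqlite3 module, not the .NET SqlClient setup described above, and the `DataAccess` class, its method names, and the `users` table are hypothetical. It only illustrates the two properties mentioned: batches are atomic, and values are bound as parameters rather than pasted into the SQL text.

```python
import sqlite3


class DataAccess:
    """Tiny data access layer: parameterized SQL inside transactions."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)

    def run_in_transaction(self, statements):
        """Run (sql, params) pairs as one unit; roll back all on error."""
        with self.conn:  # commits on success, rolls back on an exception
            for sql, params in statements:
                self.conn.execute(sql, params)

    def query(self, sql, params=()):
        """Return all rows for a parameterized SELECT."""
        return self.conn.execute(sql, params).fetchall()


INSERT_USER = "INSERT INTO users (name) VALUES (?)"

if __name__ == "__main__":
    db = DataAccess()
    db.run_in_transaction([
        ("CREATE TABLE users (id INTEGER PRIMARY KEY, "
         "name TEXT NOT NULL UNIQUE)", ()),
    ])
    # The hostile string is stored as plain data because it is a parameter.
    db.run_in_transaction([
        (INSERT_USER, ("alice",)),
        (INSERT_USER, ("x'); DROP TABLE users;--",)),
    ])
    print(db.query("SELECT name FROM users ORDER BY id"))
```

If any statement in a batch fails, for example a duplicate name violating the UNIQUE constraint, the earlier statements in that batch are rolled back along with it.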
-
Although I hate Python because of its lazy syntax (white space as part of the language), I love it because of libraries like numpy, which let you easily apply "array" operations over matrices and vectors.
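A small sketch of what those array operations look like, assuming numpy is installed. The matrix and vector values are made up for illustration.

```python
import numpy as np

# Hypothetical data: a 2x2 matrix and a length-2 vector.
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])
vector = np.array([10.0, 20.0])

# Elementwise arithmetic applies to every entry with no explicit loop.
scaled = matrix * 2

# Broadcasting: the vector is added to each row of the matrix.
shifted = matrix + vector

# The @ operator computes the matrix-vector product.
product = matrix @ vector

# Reductions take an axis; axis=0 gives one mean per column.
column_means = matrix.mean(axis=0)

if __name__ == "__main__":
    print(scaled)
    print(shifted)
    print(product)
    print(column_means)
```

The equivalent in plain Python would need nested loops or comprehensions for each of these four lines.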