Code Project

What is the right database technology for this simple outlined BI tool use case?

    User 14037628 wrote (#1):

    Reaching out to the community to pressure test our internal thinking.

    We are building a simplified business intelligence platform that will aggregate metrics (e.g. traffic, backlinks) and text lists (e.g. search keywords, technologies used) from several data providers.

    The data will be somewhat loosely structured and may change over time as vendors change their response formats.

    Long term, data volume may reach 100,000 rows x 25 input vectors.

    Data would be updated and read continuously, but not at massive concurrent volume.

    We expect to need some ETL transformations on the data gathered from partners on its way to the UI (e.g. showing trend information over the past five captured data points).

    We want to archive every single data snapshot (i.e. version it) rather than store only the most current data point.

    The persistence technology should be readily available through AWS.

    Our assumption is that our requirements lend themselves best to DynamoDB (vs. Amazon Neptune, Redshift, or Aurora).

    Is that fair to assume? Are there any other questions / information I can provide to elicit input from this community?

      Lost User wrote (#2):

      Member 14070096 wrote:

      Is that fair to assume

      No, it is an assumption. Fair would be to evaluate them on their merits, and award points for each merit. My guess is that any NoSQL database would do.

      Member 14070096 wrote:

      The data will be somewhat loosely structured and may change over time with vendors potentially changing their response formats.

      That's wrong; your format should depend on the data that you want to collect, not on the formats of the various data sources.

      Bastard Programmer from Hell :suss: If you can't read my code, try converting it here[^] "If you just follow the bacon Eddy, wherever it leads you, then you won't have to think about politics." -- Some Bell.

        Mycroft Holmes wrote (#3):

        Why a NoSQL database? I would have thought that a relational DB would serve the purpose better.

        Never underestimate the power of human stupidity - RAH I'm old. I know stuff - JSOP

          Mycroft Holmes wrote (#4):

          You will HAVE to have an ETL layer between your various sources and your database (assuming it is a relational DB). You need to bring all your sources into a single format and deal with changing source structures, which will mean recoding the ETL each time a source changes.
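
A minimal sketch of such an ETL layer, assuming two hypothetical vendors whose payload shapes (`site`/`visits`/`ts` and `domain`/`links.total`/`epoch`) are invented purely for illustration. `MetricRecord`, `ADAPTERS`, and `normalize` are likewise made-up names, not part of any library. Each source gets its own small adapter, so a vendor format change should only require recoding that one adapter:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class MetricRecord:
    """The single canonical format every source is converted into."""
    domain: str
    metric: str
    value: float
    captured_at: datetime
    source: str


def from_vendor_a(payload: Dict[str, Any]) -> MetricRecord:
    # Hypothetical vendor A: flat payload with an ISO-8601 timestamp.
    return MetricRecord(
        domain=payload["site"],
        metric="traffic",
        value=float(payload["visits"]),
        captured_at=datetime.fromisoformat(payload["ts"]),
        source="vendor_a",
    )


def from_vendor_b(payload: Dict[str, Any]) -> MetricRecord:
    # Hypothetical vendor B: nested payload with a Unix epoch timestamp.
    return MetricRecord(
        domain=payload["domain"],
        metric="backlinks",
        value=float(payload["links"]["total"]),
        captured_at=datetime.fromtimestamp(payload["epoch"], tz=timezone.utc),
        source="vendor_b",
    )


# One adapter per source; a changed vendor format touches only its adapter.
ADAPTERS: Dict[str, Callable[[Dict[str, Any]], MetricRecord]] = {
    "vendor_a": from_vendor_a,
    "vendor_b": from_vendor_b,
}


def normalize(source: str, payload: Dict[str, Any]) -> MetricRecord:
    """Convert a raw vendor payload into the canonical record."""
    try:
        adapter = ADAPTERS[source]
    except KeyError:
        raise ValueError(f"no adapter registered for source {source!r}") from None
    return adapter(payload)
```

Everything downstream of `normalize` (storage, trend queries, the UI) then sees only `MetricRecord`, regardless of which vendor supplied the data.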


            Lost User wrote (#5):

            Mycroft Holmes wrote:

            Why a NoSQL database

            Good question; his example of Dynamo is one, but..

            Mycroft Holmes wrote:

            I would have thought that a relational DB would serve the purpose better.

            ..is probably true :thumbsup:
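
To make the relational option concrete, here is a minimal sketch of how the original requirements (archive every snapshot, show a trend over the past five captured data points) might look in SQL. It uses Python's built-in `sqlite3` as a stand-in for whichever relational engine is actually chosen (e.g. Aurora); the table name `metric_snapshot`, its columns, and the helper functions are assumptions made for illustration only:

```python
import sqlite3
from typing import List, Optional, Tuple


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with an append-only snapshot table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE metric_snapshot (
            domain      TEXT NOT NULL,
            metric      TEXT NOT NULL,
            captured_at TEXT NOT NULL,  -- ISO-8601 text, assumed UTC
            value       REAL NOT NULL,
            PRIMARY KEY (domain, metric, captured_at)
        )
        """
    )
    return conn


def record_snapshot(conn: sqlite3.Connection, domain: str, metric: str,
                    captured_at: str, value: float) -> None:
    # Snapshots are only ever inserted, never updated, so history is kept.
    conn.execute(
        "INSERT INTO metric_snapshot (domain, metric, captured_at, value) "
        "VALUES (?, ?, ?, ?)",
        (domain, metric, captured_at, value),
    )


def last_five(conn: sqlite3.Connection, domain: str,
              metric: str) -> List[Tuple[str, float]]:
    """Return up to five most recent snapshots, oldest first."""
    rows = conn.execute(
        "SELECT captured_at, value FROM metric_snapshot "
        "WHERE domain = ? AND metric = ? "
        "ORDER BY captured_at DESC LIMIT 5",
        (domain, metric),
    ).fetchall()
    return rows[::-1]


def trend(conn: sqlite3.Connection, domain: str,
          metric: str) -> Optional[float]:
    """Change in value between the oldest and newest of the last five points."""
    points = last_five(conn, domain, metric)
    if len(points) < 2:
        return None  # not enough history to speak of a trend
    return points[-1][1] - points[0][1]
```

Ordering by `captured_at` as text only works here because every timestamp is assumed to use the same ISO-8601 format and time zone; a production schema would more likely use the engine's native timestamp type.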

