Skip to content
  • Categories
  • Recent
  • Tags
  • Popular
  • World
  • Users
  • Groups
Skins
  • Light
  • Cerulean
  • Cosmo
  • Flatly
  • Journal
  • Litera
  • Lumen
  • Lux
  • Materia
  • Minty
  • Morph
  • Pulse
  • Sandstone
  • Simplex
  • Sketchy
  • Spacelab
  • United
  • Yeti
  • Zephyr
  • Dark
  • Cyborg
  • Darkly
  • Quartz
  • Slate
  • Solar
  • Superhero
  • Vapor

  • Default (No Skin)
  • No Skin
Collapse
Code Project
  1. Home
  2. General Programming
  3. Algorithms
  4. Prefix sum: GenomicRangeQuery

Prefix sum: GenomicRangeQuery

Scheduled Pinned Locked Moved Algorithms
tutorialhelpquestionc++delphi
3 Posts 3 Posters 0 Views 1 Watching
  • Oldest to Newest
  • Newest to Oldest
  • Most Votes
Reply
  • Reply as topic
Log in to reply
This topic has been deleted. Only users with topic management privileges can see it.
  • Y Offline
    Y Offline
    Yugo Amaryl
    wrote on last edited by
    #1

    In its "Prefix sum" section, Codility proposes the following problem. I don't see how to solve it within the constraints, that are O(N+M), and O(N) for space complexity. I know I should use the prefix sum technique, but I would need more hints. I'd like to not be given the solutions, but hints that would help me guide my thought process. -- A DNA sequence can be represented as a string consisting of the letters A, C, G and T, which correspond to the types of successive nucleotides in the sequence. Each nucleotide has an impact factor, which is an integer. Nucleotides of types A, C, G and T have impact factors of 1, 2, 3 and 4, respectively. You are going to answer several queries of the form: What is the minimal impact factor of nucleotides contained in a particular part of the given DNA sequence? The DNA sequence is given as a non-empty string S = S[0]S[1]...S[N-1] consisting of N characters. There are M queries, which are given in non-empty arrays P and Q, each consisting of M integers. The K-th query (0 ≤ K < M) requires you to find the minimal impact factor of nucleotides contained in the DNA sequence between positions P[K] and Q[K] (inclusive). For example, consider string S = CAGCCTA and arrays P, Q such that: P[0] = 2 Q[0] = 4 P[1] = 5 Q[1] = 5 P[2] = 0 Q[2] = 6 The answers to these M = 3 queries are as follows: The part of the DNA between positions 2 and 4 contains nucleotides G and C (twice), whose impact factors are 3 and 2 respectively, so the answer is 2. The part between positions 5 and 5 contains a single nucleotide T, whose impact factor is 4, so the answer is 4. The part between positions 0 and 6 (the whole string) contains all nucleotides, in particular nucleotide A whose impact factor is 1, so the answer is 1. Write a function: def solution(S, P, Q) that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M integers specifying the consecutive answers to all queries. The sequence should be returned as: a Results structure (in C), or a vector of integers (in C++), or a Results record (in Pascal), or an array of integers (in any other programming language). For example, given the string S = CAGCCTA and arrays P, Q such that: P[0] = 2 Q[0] = 4 P[1] = 5 Q[1] = 5 P[2] = 0 Q[2] = 6 the function should return the values [2, 4, 1], as expla

    L H 2 Replies Last reply
    0
    • Y Yugo Amaryl

      In its "Prefix sum" section, Codility proposes the following problem. I don't see how to solve it within the constraints, that are O(N+M), and O(N) for space complexity. I know I should use the prefix sum technique, but I would need more hints. I'd like to not be given the solutions, but hints that would help me guide my thought process. -- A DNA sequence can be represented as a string consisting of the letters A, C, G and T, which correspond to the types of successive nucleotides in the sequence. Each nucleotide has an impact factor, which is an integer. Nucleotides of types A, C, G and T have impact factors of 1, 2, 3 and 4, respectively. You are going to answer several queries of the form: What is the minimal impact factor of nucleotides contained in a particular part of the given DNA sequence? The DNA sequence is given as a non-empty string S = S[0]S[1]...S[N-1] consisting of N characters. There are M queries, which are given in non-empty arrays P and Q, each consisting of M integers. The K-th query (0 ≤ K < M) requires you to find the minimal impact factor of nucleotides contained in the DNA sequence between positions P[K] and Q[K] (inclusive). For example, consider string S = CAGCCTA and arrays P, Q such that: P[0] = 2 Q[0] = 4 P[1] = 5 Q[1] = 5 P[2] = 0 Q[2] = 6 The answers to these M = 3 queries are as follows: The part of the DNA between positions 2 and 4 contains nucleotides G and C (twice), whose impact factors are 3 and 2 respectively, so the answer is 2. The part between positions 5 and 5 contains a single nucleotide T, whose impact factor is 4, so the answer is 4. The part between positions 0 and 6 (the whole string) contains all nucleotides, in particular nucleotide A whose impact factor is 1, so the answer is 1. Write a function: def solution(S, P, Q) that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M integers specifying the consecutive answers to all queries. The sequence should be returned as: a Results structure (in C), or a vector of integers (in C++), or a Results record (in Pascal), or an array of integers (in any other programming language). For example, given the string S = CAGCCTA and arrays P, Q such that: P[0] = 2 Q[0] = 4 P[1] = 5 Q[1] = 5 P[2] = 0 Q[2] = 6 the function should return the values [2, 4, 1], as expla

      L Offline
      L Offline
      Lost User
      wrote on last edited by
      #2

      S is a string; or an array of characters; characters which "convert" to impact numbers. P and Q are (3) "substring ranges" into S. The subset / substring of S will contain one or more "MIN" impact numbers.

      "(I) am amazed to see myself here rather than there ... now rather than then". ― Blaise Pascal

      1 Reply Last reply
      0
      • Y Yugo Amaryl

        In its "Prefix sum" section, Codility proposes the following problem. I don't see how to solve it within the constraints, that are O(N+M), and O(N) for space complexity. I know I should use the prefix sum technique, but I would need more hints. I'd like to not be given the solutions, but hints that would help me guide my thought process. -- A DNA sequence can be represented as a string consisting of the letters A, C, G and T, which correspond to the types of successive nucleotides in the sequence. Each nucleotide has an impact factor, which is an integer. Nucleotides of types A, C, G and T have impact factors of 1, 2, 3 and 4, respectively. You are going to answer several queries of the form: What is the minimal impact factor of nucleotides contained in a particular part of the given DNA sequence? The DNA sequence is given as a non-empty string S = S[0]S[1]...S[N-1] consisting of N characters. There are M queries, which are given in non-empty arrays P and Q, each consisting of M integers. The K-th query (0 ≤ K < M) requires you to find the minimal impact factor of nucleotides contained in the DNA sequence between positions P[K] and Q[K] (inclusive). For example, consider string S = CAGCCTA and arrays P, Q such that: P[0] = 2 Q[0] = 4 P[1] = 5 Q[1] = 5 P[2] = 0 Q[2] = 6 The answers to these M = 3 queries are as follows: The part of the DNA between positions 2 and 4 contains nucleotides G and C (twice), whose impact factors are 3 and 2 respectively, so the answer is 2. The part between positions 5 and 5 contains a single nucleotide T, whose impact factor is 4, so the answer is 4. The part between positions 0 and 6 (the whole string) contains all nucleotides, in particular nucleotide A whose impact factor is 1, so the answer is 1. Write a function: def solution(S, P, Q) that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M integers specifying the consecutive answers to all queries. The sequence should be returned as: a Results structure (in C), or a vector of integers (in C++), or a Results record (in Pascal), or an array of integers (in any other programming language). For example, given the string S = CAGCCTA and arrays P, Q such that: P[0] = 2 Q[0] = 4 P[1] = 5 Q[1] = 5 P[2] = 0 Q[2] = 6 the function should return the values [2, 4, 1], as expla

        H Offline
        H Offline
        honey the codewitch
        wrote on last edited by
        #3

        Wow. There are so many things I love about this question. I'm going to save it. And I'm going to try if not to solve it, then to demonstrate why I can't solve it, using code that consumes some finite computation space or time. I'm mucking about with a self-modifying linear bounded automata interpreter/compiler i've built and i am crap at most math, but just good enough at the right kinds of math to know how to build this problem and it express it through that. And you've asked what may be exactly the question I need. How many times can someone say that? Thank you =)

        1 Reply Last reply
        0
        Reply
        • Reply as topic
        Log in to reply
        • Oldest to Newest
        • Newest to Oldest
        • Most Votes


        • Login

        • Don't have an account? Register

        • Login or register to search.
        • First post
          Last post
        0
        • Categories
        • Recent
        • Tags
        • Popular
        • World
        • Users
        • Groups