Code Project
The Lounge

Adventures in Async

Tags: html, database, com, graphics, algorithms
51 Posts 21 Posters 0 Views 1 Watching
  • J Jorgen Andersson

    I think I see why I missed that possibility: it does not seem to exist on SQL Server 2012.

    Wrong is evil and must be defeated. - Jeff Ello

    abmv wrote (#31):

    If it's the dev environment you have, you can run the SQL Server setup and select the components needed to get the SSIS services and the VS-based client tools; they're on the ISO or DVD. There is also OPENROWSET: [Simple way to Import XML Data into SQL Server with T-SQL](https://www.mssqltips.com/sqlservertip/5707/simple-way-to-import-xml-data-into-sql-server-with-tsql/)

    Caveat Emptor. "Progress doesn't come from early risers – progress is made by lazy men looking for easier ways to do things." Lazarus Long

    We are in the beginning of a mass extinction. - Greta Thunberg

    • OriginalGriffO OriginalGriff

      Jörgen Andersson wrote:

      netizens of the lounge to have a laugh on my behalf.

      We wouldn't do that! :laugh:

      "I have no idea what I did, but I'm taking full credit for it." - ThisOldTony AntiTwitter: @DalekDave is now a follower!

      Kirk Hawley wrote (#32):

      No. Programming is hard.

      Recursion is for programmers who haven't blown enough stacks yet.

      • OriginalGriffO OriginalGriff

        Jörgen Andersson wrote:

        netizens of the lounge to have a laugh on my behalf.

        We wouldn't do that! :laugh:

        "I have no idea what I did, but I'm taking full credit for it." - ThisOldTony AntiTwitter: @DalekDave is now a follower!

        Kirk Hawley wrote (#33):

        OriginalGriff wrote:

        We wouldn't do that! :laugh:

        No. Programming is hard.

        Recursion is for programmers who haven't blown enough stacks yet.

        • J Jorgen Andersson

          Never bothered with async programming before, since I never needed it. But now I have to take care of a weekly delivery of an 80 GB (eighty gigabyte) XML file. Parsing it and saving 10 million records to 30 different tables in a database takes more than an hour, and there's no simple optimization left to do. But I only use one core of the processor, so let's go parallel, it'll be fun learning. Right?

          The easiest part is bulk copying to the database in parallel. Easy enough, but it only shaves five minutes off the total time; this is not where the biggest bottleneck is. The biggest bottleneck is the actual parsing of the XML.

          I don't want to rework the whole application to use locks and thread-safe collections, so I decide to split the work vertically instead: add a task for every collection of data. Also easy enough. Now the processor is working close to 100%, but it takes twice as long. :wtf: Apparently the creation of tasks has more overhead than the parsing of the data itself. :laugh: No shortcuts for me today. Back to the drawing board.
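A minimal sketch of the task-granularity trap described above, assuming a .NET-style pipeline (the record list and batch size are invented for illustration, not the poster's actual code): the usual fix is to hand each Task a batch of records, so the per-task scheduling overhead is amortized over meaningful work.

```csharp
// Hypothetical sketch: one task per *batch* of records, not one task per
// record, so Task.Run overhead is paid once per batch. ParseRecord/SaveBatch
// are stand-ins (here just a count) for real parse+save work.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class BatchedPipeline
{
    public static int ProcessInBatches(IEnumerable<string> records, int batchSize)
    {
        var tasks = new List<Task<int>>();
        var batch = new List<string>(batchSize);
        foreach (var r in records)
        {
            batch.Add(r);
            if (batch.Count == batchSize)
            {
                var work = batch;                 // capture the filled batch
                batch = new List<string>(batchSize);
                tasks.Add(Task.Run(() => work.Count)); // stand-in for parse+save
            }
        }
        if (batch.Count > 0)
        {
            var work = batch;
            tasks.Add(Task.Run(() => work.Count));
        }
        Task.WaitAll(tasks.ToArray());
        int total = 0;
        foreach (var t in tasks) total += t.Result;
        return total;
    }
}
```

With a batch size in the thousands, the task count stays near the core count and the scheduler stops dominating the runtime.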

          Wrong is evil and must be defeated. - Jeff Ello

          Nelek wrote (#34):

          I can't help it: reading "parse" in the body of the message... This is clearly a case for... HONEY THE @CODE-WITCH tatatataaaaaaa :laugh: :laugh: :laugh: :laugh:

          M.D.V. ;) If something has a solution... Why do we have to worry about?. If it has no solution... For what reason do we have to worry about? Help me to understand what I'm saying, and I'll explain it better to you Rating helpful answers is nice, but saying thanks can be even nicer.

          • J Jorgen Andersson

            Never bothered with Async programming before since I never needed it. […] No shortcuts for me today. Back to the drawing board.

            Wrong is evil and must be defeated. - Jeff Ello

            honey the codewitch wrote (#35):

            Are you sure the bottleneck isn't disk I/O?

            Real programmers use butterflies

            • J Jorgen Andersson

              Never bothered with Async programming before since I never needed it. […] No shortcuts for me today. Back to the drawing board.

              Wrong is evil and must be defeated. - Jeff Ello

              Padanian wrote (#36):

              40 GB of your 80 GB XML file are tags. So much for the overhead. My suggestion: drop all markup languages worldwide (XML, JSON and similar shiite).

              • P Padanian

                40GB of your 80GB XML file are tags. So much for the overhead. The suggestion is to worldwide drop all markup languages (XML, JSON and similar shiite)

                kalberts wrote (#37):

                But compact binary formats are almost impossible to patch up using vi. Linux guys will feel completely lost!

                • K kalberts

                  But compact binary formats are almost impossible to patch up using vi. Linux guys will feel completely lost!

                  Padanian wrote (#38):

                  And?

                  • P Padanian

                    40GB of your 80GB XML file are tags. So much for the overhead. The suggestion is to worldwide drop all markup languages (XML, JSON and similar shiite)

                    Jorgen Andersson wrote (#39):

                    At least 55 GB, actually. The database I'm copying the data to is only 25 GB at the moment, and that includes quite a bit of overhead too.

                    Wrong is evil and must be defeated. - Jeff Ello

                    • H honey the codewitch

                      are you sure the bottleneck isnt disk i/o?

                      Real programmers use butterflies

                      Jorgen Andersson wrote (#40):

                      Yes. Just to make sure, I've made test runs reading only an ID from every record, which runs twice as fast, and that's on a slow HDD here at home. When I move this to a server the disks will be considerably faster.

                      Wrong is evil and must be defeated. - Jeff Ello

                      • J Jorgen Andersson

                        At least 55GB actually. The Database I'm copying the data to is only 25GB at the moment and that's also having quite some overhead.

                        Wrong is evil and must be defeated. - Jeff Ello

                        Padanian wrote (#41):

                        There you go. Something went terribly wrong in the history of computing.

                        • J Jorgen Andersson

                          OriginalGriff wrote:

                          Oh yes, as soon as your thread count exceeds the core count, you are going to get some slowdown.

                          Didn't even do that. :) I'm fully aware of where I went wrong; I posted it for the netizens of the lounge to have a laugh on my behalf. In this case the specific problem is that each piece of work is smaller than the cost of creating a task. And my error in the bigger picture is that one cannot simply convert a task running in sync to one running in async. It has to be purpose built.
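The "work smaller than the cost of a task" point can be sketched with range partitioning, where the parallel unit is a chunk of indices rather than a single item; Partitioner.Create(fromInclusive, toExclusive) is the stock .NET way to do this, and the sum-of-squares loop is just a stand-in work item.

```csharp
// Sketch: Parallel.ForEach over index *ranges*, so each task's slice of work
// dwarfs the scheduling cost. One lock per chunk, not one per item.
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class Chunked
{
    public static long SumSquares(int n)
    {
        long total = 0;
        var gate = new object();
        Parallel.ForEach(Partitioner.Create(0, n), range =>
        {
            long local = 0;                       // thread-local partial sum
            for (int i = range.Item1; i < range.Item2; i++)
                local += (long)i * i;
            lock (gate) total += local;           // merge once per chunk
        });
        return total;
    }
}
```

The same shape works for parsing: hand each worker a contiguous slab of records, accumulate locally, merge once at the end.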

                          Wrong is evil and must be defeated. - Jeff Ello

                          Leo56 wrote (#42):

                          Quote:

                          my error in the bigger picture is that one cannot simply convert a task running in sync to one running in async. It has to be purpose built.

                          Amen brother - been there, seen that, still feel the pain... :sigh:

                          • P Padanian

                            And?

                            kalberts wrote (#43):

                            Lots of Linux guys work in Windows environments. Lots of them hate it; they do it just to earn money so they can pay for the home computers they use to contribute to Linux-based open source projects in their spare time. And they spend a lot of energy bitching about things not being exactly as they are used to in the Linux world. My comment was "based on a true story".

                            I made one application storing a fairly complex persistent data structure in a binary format. This was met with heavy criticism: what if that data structure becomes inconsistent - how can we fix up the inconsistencies when it is not in a readable format? I guess I wasn't too polite when I answered that one major reason for not using a readable format was to prevent them from poking into the file with vi and introducing inconsistencies.

                            In the system I am working on now: it is a Windows desktop application, but there is a function for converting all file system paths to Unix-style forward-slash path separators, and a handful of utility functions that fail if you submit a DOS/Windows-style path with backslashes. Forward slashes are the only "correct" path format, they claim - DOS/Windows was simply wrong until it started accepting the correct format. So the (Windows) users of this program must simply accept that, when using the conventions of their OS, they are simply wrong.

                            In an earlier project, the Linux mafia forced me to make special adaptations in my (very) Windows-specific utility: they insisted on running it, in their shell-based batch jobs, from a Linux-adapted command shell that enforced case-sensitive environment symbols. They made use of it, too: their jobs started crashing, and it boiled down to my utility treating symbols differing only in case as synonyms, while they were distinct in their jobs.

                            In my current project, one of the first things I did was to replace case-sensitive file name comparisons with case-insensitive ones. It was argued: "But cmake always uses CMakeLists.txt, with exactly that casing! There is no need to do a case insensitive comparison!" Well... why did the program barf, then? Someone wrote CmakeLists.txt, and the program just failed because it didn't find the file.

                            I would have tolerated all this a lot more if it wasn't for the constant bitching from the Linux mafia about Windows users refusing to learn anything new, clinging to Windows ways of doing things (when working under Windows) rather than learning the way these wonderful command-line utilities ported from the wonderful world of free and unsupported software expect you to put everything in a loooong command line.

                            • K kalberts

                              Lots of Linux guys working in Windows environments. […]

                              Padanian wrote (#44):

                              This should be published for everlasting memory. My congrats.

                              • OriginalGriffO OriginalGriff

                                Oh yes, as soon as your thread count exceeds the core count, you are going to get some slowdown. You need to be aware that threading is not a "magic bullet" that will solve all your performance woes at a stroke - it needs to be carefully thought about and planned, or it can do two things: 1) slow your machine to a crawl and make your application considerably slower than it started out, or 2) crash or lock up your app completely. The reasons why are simple:

                                1) Threads require two things to run: memory and a free core. The memory will be at the very least the size of a system stack in your language (usually around 1 MB for Windows, 8 MB for Linux), plus some overhead for the thread itself, and yet more for any memory-based objects each thread creates; and a thread can only run when a core becomes available. If you generate more threads than you have cores, then most of them will spend a lot of time sitting and waiting for a core to become available. The more threads you generate, the worse the problem becomes: more threads put more load on the system to switch threads more often, and that takes core time as well. All threads in the system, from all processes, share the cores in the machine, so other apps and system threads also need their time to run. Add too many, and the system will spend more and more of its time trying to work out which thread to run, and performance degrades. Generate enough threads to exceed the physical memory in your computer and performance suddenly takes an enormous hit as the virtual memory system kicks in and starts thrashing memory pages to the HDD.

                                2) Multiple threads within a process have to be thread safe because they share memory and other resources - which means that several things can happen:

                                2a) If two threads need the same resources, then you can easily end up in a situation where thread A has locked resource X and wants Y, while thread B has locked resource Y and wants X. At this point a "deadly embrace" has occurred, and neither thread (nor any other that needs X or Y) can ever run again.

                                2b) If your code isn't thread safe, then different threads can try to read and/or alter the same memory at the same time; this often happens when trying to add or remove items from a collection. At this point strange things start to happen, up to and including your app crashing.

                                2c) If resources have a finite capacity - like the bandwidth of an internet connection, for example - then bad threading can easily use it all, at either end of the link. If you run out of capacity, your threads will stall.
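Point 2a, sketched: the standard way out of the deadly embrace is to impose a single global lock order, so the circular wait can never form. X and Y here are stand-ins for whatever two resources the threads share.

```csharp
// Sketch: both workers acquire X first, then Y - never the reverse - so
// thread A can never hold Y while waiting for X. Both threads complete.
using System;
using System.Threading;

static class LockOrdering
{
    static readonly object X = new object();
    static readonly object Y = new object();

    public static int RunBoth()
    {
        int done = 0;
        // Same lock order in both threads: X before Y.
        var a = new Thread(() => { lock (X) lock (Y) Interlocked.Increment(ref done); });
        var b = new Thread(() => { lock (X) lock (Y) Interlocked.Increment(ref done); });
        a.Start(); b.Start();
        a.Join(); b.Join();
        return done; // both threads finished, no deadlock
    }
}
```

If thread b instead took Y first, the pair could stall forever with each holding the lock the other wants.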

                                 Member_14678461 wrote (#45):

                                 Good analysis. The best way to think about it is this: you yourself cannot really multitask. You can time-slice (we used to call this time sharing), or you can delegate. Everything done internally is really just time slicing, partitioned according to the rules and privileges you assign to processes, and to the threads within those processes. ThisOldTony has it right; I am just an echo.

                                • J Jorgen Andersson

                                   Never bothered with Async programming before since I never needed it. […] No shortcuts for me today. Back to the drawing board.

                                  Wrong is evil and must be defeated. - Jeff Ello

                                   KateAshman wrote (#46):

                                   XML parsing is a forgotten art, but I have a story that might inspire your creativity. 😉 I once had to build an XML parser that could process a 1.8 GB file on demand, with the intent of generating C++ header files. The hard part was terrible formatting and not being able to pre-process the darn thing, which forced me to use a single-pass, multi-line regex implementation. After a couple of weeks struggling with it, my biggest time save finally came from switching to a stream reader. When I read your story, my first idea was to use a non-locking stream reader and simply run the thing 4 times on 4 cores. Personally, I don't see the need for parallelism in this instance and I think it's a red herring, to be honest. Anyways, good luck. 👍
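The stream-reader idea above can be sketched with .NET's forward-only XmlReader, which never materializes the whole document; the `<record id="...">` element shape is invented for the example.

```csharp
// Sketch: pull-parse an XML stream one node at a time, collecting one
// attribute per record. Memory use stays flat no matter how big the file is.
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class StreamingParse
{
    public static List<string> ReadIds(TextReader source)
    {
        var ids = new List<string>();
        using (var reader = XmlReader.Create(source))
        {
            while (reader.Read())
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "record")
                    ids.Add(reader.GetAttribute("id"));
        }
        return ids;
    }
}
```

The same loop works against a FileStream over an 80 GB file; the reader only ever holds the current node.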

                                  • K kalberts

                                     Lots of Linux guys working in Windows environments. […]

                                     Jorgen Andersson wrote (#47):

                                    As I see it: There must be a reason people are paying for not having to use a free OS.

                                    Wrong is evil and must be defeated. - Jeff Ello

                                    • K KateAshman

                                       XML parsing is a forgotten art, but I have a story that might inspire your creativity. […]

                                       Jorgen Andersson wrote (#48):

                                      I'm probably skipping it. The sequential program works and can run in the background without any problems. For me it was mostly an educational experience.

                                      Wrong is evil and must be defeated. - Jeff Ello

                                      • J Jorgen Andersson

                                         Never bothered with Async programming before since I never needed it. […] No shortcuts for me today. Back to the drawing board.

                                        Wrong is evil and must be defeated. - Jeff Ello

                                         Raphael Muindi Jr wrote (#49):

                                        Are you, by any chance, using Linq?

                                        • K kalberts

                                           Lots of Linux guys working in Windows environments. […]

                                           Raphael Muindi Jr wrote (#50):

                                           Hahahaha:

                                           "I would have tolerated this a lot more if it wasn't for the constant bitching from the Linux mafia about Windows users refusing to learn anything new, but cling to Windows ways of doing things (when working under Windows) rather than learning the way these wonderful command-line utilities ported from the wonderful world of free and unsupported software expects you to put everything in a loooong command line."

                                           Where are the Async adventures?
