What is generally the current best method for storing uploaded documents?
-
Hi forum,

What is currently the best method, in terms of security as well as scalability and least complexity, to store user-uploaded documents on a shared hosting platform?

* Is it to store the uploaded documents in a secure folder location, with a reference pointer (file path) in the database?
* Or to store the documents in the database itself (BLOB data type)?
* Or to use a NoSQL "document store" database?

The documents uploaded will be:

* A mix of sensitive information (e.g. containing a living person's date of birth) and historical, non-sensitive information
* Varying in size from 1 page or image to several dozen
* Varying in document type, mainly .pdf, image files (.png, .jpeg, etc.), .doc or .txt text files (there will be no audio or video file types)

The number of documents stored in the first year is estimated between 100 and 500, with about 1000 to 1200 additional each of the next couple of years. If/when the site outgrows a shared hosting environment, other hosted solutions will be explored.

Other info:

* PHP version 8.3.2
* MySQL version 8.3.0 (InnoDB storage engine)

Thanks in advance!
-
I suggest you stick with the first approach (storing files in the filesystem). Storing large files in the DB creates a lot of overhead when scanning tables, inserting new rows, etc., since such records span multiple physical pages. As a rule of thumb, use a database for structured data and the filesystem for arbitrary unstructured files.

When it comes to NoSQL stores, most of the time you still expect the data there to conform to some schema. Their main use case is horizontal scaling, made possible by relaxed transactional guarantees (you can read up on the "CAP theorem" if you want to).
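To make the "file on disk, pointer in the database" approach concrete, here is a minimal sketch. It is written in Python with an in-memory SQLite table purely for illustration; the same idea carries over to PHP and MySQL. Every name in it (`documents`, `store_upload`, `ALLOWED_EXTENSIONS`, the column names) is hypothetical, and it assumes the storage directory sits outside the web root so files cannot be requested directly by URL.

```python
import hashlib
import secrets
import sqlite3
from pathlib import Path

# Hypothetical allow-list, based on the file types mentioned in the question.
ALLOWED_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".doc", ".txt"}


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) metadata table that points at files on disk."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS documents (
               id            INTEGER PRIMARY KEY,
               original_name TEXT NOT NULL,
               stored_name   TEXT NOT NULL UNIQUE,
               size_bytes    INTEGER NOT NULL,
               sha256        TEXT NOT NULL
           )"""
    )


def store_upload(conn: sqlite3.Connection, storage_dir: Path,
                 original_name: str, data: bytes) -> int:
    """Write the bytes to disk under a random name and record the pointer."""
    ext = Path(original_name).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"file type not allowed: {ext!r}")

    storage_dir.mkdir(parents=True, exist_ok=True)
    # The user-supplied name is never used as a path; it is kept only as metadata.
    stored_name = secrets.token_hex(16) + ext
    (storage_dir / stored_name).write_bytes(data)

    cur = conn.execute(
        "INSERT INTO documents (original_name, stored_name, size_bytes, sha256) "
        "VALUES (?, ?, ?, ?)",
        (original_name, stored_name, len(data), hashlib.sha256(data).hexdigest()),
    )
    conn.commit()
    return cur.lastrowid
```

Generating the stored name on the server means a malicious upload name (path traversal, a script extension) never reaches the filesystem, and the stored hash lets you detect a damaged or swapped file later.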
-
Bohdan Stupak wrote:
Storing large files in DB creates a lot of overhead when scanning table, inserting new rows,
That is true, but nothing in the OP suggests the data will come anywhere close to that. The description suggests very few docs, each of them small. One might also infer that the churn rate is non-existent.
-
we5inelgr wrote:
The number of documents stored in the first year is estimated between 100 and 500, with about 1000 to 1200 additional each of the next couple of years. If/when the site outgrows a shared hosting environment
Those statements seem contradictory: you are describing a very small data set, unless your description is incorrect. Even if you scale your estimate up by a factor of 10, after 5 years the number of docs is about 50,000. That might seem like a lot, but the rest of your description suggests that each one is pretty small. If each is 1 MB, then 50,000 docs come to 50 GB of data, though my sizing might be way over. If it is only about 6,000 docs at 10 KB each, that is only 60 MB, which is going to fit in anything that you might have.
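The back-of-the-envelope sizing above is easy to redo with your own numbers. A small sketch, assuming decimal units (1 MB = 1,000,000 bytes) to match the round figures in this post; the function names are just for illustration:

```python
def total_size_bytes(doc_count: int, avg_doc_bytes: int) -> int:
    """Total storage needed for doc_count documents of a given average size."""
    return doc_count * avg_doc_bytes


def human(n_bytes: int) -> str:
    """Format a byte count using decimal units (KB = 10^3, MB = 10^6, GB = 10^9)."""
    for unit, factor in (("GB", 10**9), ("MB", 10**6), ("KB", 10**3)):
        if n_bytes >= factor:
            return f"{n_bytes / factor:g} {unit}"
    return f"{n_bytes} B"


# Pessimistic case: 50,000 docs at 1 MB each.
print(human(total_size_bytes(50_000, 10**6)))  # 50 GB
# Case closer to the OP's description: about 6,000 docs at 10 KB each.
print(human(total_size_bytes(6_000, 10**4)))   # 60 MB
```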
we5inelgr wrote:
in terms of security
Secure why? You mentioned birthdays. If you are a business, then you need all of that encrypted. But if this is just for you, is the only security you want that you don't lose it? If so, then you need two different ways to back it up. Online and local would be best.
-
The choice between storing files in a filesystem or as BLOBs in a database depends on various factors; both approaches have their own pros and cons to consider.

Filesystem

Pros:
* Generally considered faster for read and write operations compared to databases.
* Much easier to scale horizontally by adding more servers with shared access to the file system.

Cons:
* Handling backups and recovery might be more complex, especially as the store grows over time.
* Keeping file data and related metadata consistent can be challenging.

Database

Pros:
* Easier to maintain consistency between file data and metadata in a transactional database.
* Database backups usually cover both file data and metadata.

Cons:
* Retrieving and storing large files can impact database performance.
* You may face scalability challenges when dealing with a large number of files.
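On the consistency point for the filesystem option: one common way to keep it manageable is a periodic audit that compares what the database says should exist with what is actually on disk. A minimal Python sketch follows; `audit_storage` and its parameters are hypothetical names, and it assumes the set of recorded file names has already been read from the metadata table.

```python
from pathlib import Path


def audit_storage(storage_dir: Path, recorded_names: set[str]) -> tuple[set[str], set[str]]:
    """Compare files on disk against names recorded in the database.

    Returns (orphaned, missing):
      orphaned - files on disk that have no database row
      missing  - database rows whose file is not on disk
    """
    on_disk = {p.name for p in storage_dir.iterdir() if p.is_file()}
    orphaned = on_disk - recorded_names
    missing = recorded_names - on_disk
    return orphaned, missing
```

Orphaned files usually mean an upload was written but the insert failed; missing files usually mean a restore or cleanup only covered one of the two stores, which is also a useful check after restoring from backup.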