OpenAI have a good business model here, though possibly a bit unethical.
Shopify (who recently laid me off, but I still speak highly of them) blocked public access to ChatGPT's website. But you could use Shopify's internal tool (built using https://github.com/mckaywrigley/chatbot-ui) to access the APIs, including GPT-4. And it was great!
So look at this from OpenAI's perspective. They could put up a big banner saying "Hey everyone, we use everything you tell ChatGPT to train it to be smarter. Please don't tell it anything confidential!". And then also say "By the way, we have private API access that doesn't use anything you say as training inputs- maybe your company would prefer that?"
The louder they shout those two things, the more businesses will line up to pay them.
And the reason they can do this: they've built a brilliant product that everyone wants to use, everyone is going to use.
Related: OpenAI announced (kinda hidden in a recent blog post) that they are working on a ChatGPT Business subscription so businesses can get this without writing their own UI. I expect it to be popular.
> We are also working on a new ChatGPT Business subscription for professionals who need more control over their data as well as enterprises seeking to manage their end users. ChatGPT Business will follow our API’s data usage policies, which means that end users’ data won’t be used to train our models by default.
It's interesting you mention Shopify and how they use ChatGPT. Yesterday the founder and CEO of Shopify, Tobias Lütke, sat down for an interview with Sam Altman.
OpenAI ain’t desperate for cash. It is a pure strategy position. If they wanted another 10bil they’d have to hire a bouncer for the queue. It is like the idealized wet dream startup.
> Do you think selling equity and taking on debt are forms of revenue?
For the purposes of executive payout, yes? Sure, maybe eventually you have to deliver revenue, but what people really care about is stock price.
(Also, $100B is a huge market-distorting amount of money, roughly the national debt of Sweden; what are they going to do with that? How much will they spend on GPUs and energy?)
Using that much energy, with at least a half dozen competitors using something comparable, and not even counting the energy needed to do actual inference, they should just worry about not having their data centers get flooded, or burned in a forest fire, etc.
I think it will be a rather funny, poignant thing to come to pass if the earth itself prevents AGI. Like it will be just waking up as the now-seasonal midwestern fire storms incinerate the building. It will be alive just long enough to tell us how idiotic we have been in managing our resources.
> There is a blog called 'Do the Math' by Tom Murphy. The idea was to take Earth's current energy requirements and extrapolate them at the current 3% year-over-year growth. I believe it is by the year 3,400 that we would use all the energy of the Milky Way. The idea was to prove that we cannot grow forever, because in 1,400 years we would somehow use all the energy of a space 100,000 light years across. Good luck with that.
We are still talking about stupidly, ridiculously humongous amounts of energy, but the universe itself is a hard limit on energy input. It was a real hard punch against my assumptions in life.
I think you two are looking at it differently. GP seems to be saying that they don't/won't have trouble raising cash while you seem to be saying that they need to raise money.
I think both are true. Their burn is massive so they need to raise more money, but there's so much excitement they won't have any problem doing so
He's not saying they don't need cash, he's saying they're not desperate for cash. As in, they need a lot of cash, but there are more than enough businesses and investors willing to give them more money than they need.
They don't need to scramble to get sales right now to become profitable, is what I mean. They can get more investment. What matters is not how much sales revenue they get this quarter.
How can you tell if a third party is actually doing what they are claiming they are doing? You can't. You can only observe their overt and visible behavior, not their covert behavior.
If OpenAI promises to use your data one way — and, in fact, upsells you on that promise — and then gets caught using it another way, that’s a huge lawsuit, not to mention immediate violation of EU privacy laws. Given the size of OpenAI, the consequences they’d face would be detrimental to their operation, and the reward for the whistleblower would be well worth it to reveal what’s going on.
Not true. Such a reveal already happened ages ago with regular old non-AI huge US tech companies. It was revealed they leak data (on an unlimited scale, actively, secretly, knowingly and purposefully without admitting it).
The data leaking was then further legalized by (almost) all Western governments and extended from secret services to other services (police, army, etc).
Is it? Whistleblowers rarely avoid major headaches, let alone get some form of reward for their public service. And we have tons of examples of companies collecting, selling, or leaking private data illegally, with zero or negligible consequences.
You're saying how things should be, not how they are.
You can't get perfection, but you can do things like ask them to document the controls they implement and to provide evidence that they have operated them appropriately via an audit like SOC 2.
Agree. Exactly how privacy is addressed across their product offering is incredibly ambiguous, hard to pin down, and spread out across the site.
Trying to explain to management what's ok and what isn't, and what the risks are, in this space is quite a challenge without clear commitments and documentation.
The trouble is, I don't trust their "private API access" to not mine the data surreptitiously any more than I trusted Facebook to not exfiltrate my video and audio and my messages and…
Exactly. All it takes is one clueless employee to use that data lake by mistake while training. The thing with these models is you can't untrain just 3% of the input, you have to start from the beginning
Unless they can promise it's never being reviewed/saved anywhere, it will be used for evil eventually, intentionally or not
> You think they don't checkpoint the model before feeding in a new slug of data?
They definitely do that otherwise they can't rewind the model when the loss shoots up. Unstable training happens to almost all LLMs, but can be managed by rewinding & skipping a few batches.
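Roughly, the loop looks something like this (a minimal sketch of my own in PyTorch-style Python; the spike threshold and skip count are made-up illustrative values, not anything OpenAI has published):

    import copy

    def train(model, optimizer, batches, spike_factor=3.0, checkpoint_every=100):
        # Keep a recent snapshot of the weights; if the loss suddenly spikes,
        # restore the snapshot and skip the batches blamed for the spike.
        checkpoint = copy.deepcopy(model.state_dict())
        running_loss = None
        skip_until = -1
        for step, batch in enumerate(batches):
            if step <= skip_until:
                continue                              # skip a few batches after a spike
            optimizer.zero_grad()
            loss = model(batch)                       # assume the model returns its training loss
            loss.backward()
            optimizer.step()
            if running_loss is None:
                running_loss = loss.item()
            if loss.item() > spike_factor * running_loss:
                model.load_state_dict(checkpoint)     # rewind to the last good weights
                skip_until = step + 5                 # ...and skip the next few batches
                continue
            running_loss = 0.99 * running_loss + 0.01 * loss.item()
            if step % checkpoint_every == 0:
                checkpoint = copy.deepcopy(model.state_dict())

A real trainer would also snapshot the optimizer state and write checkpoints to disk, but the point stands: you can only rewind to states you actually saved, which is why selectively "untraining" one contribution is so hard.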
And if the first "slug of data" contains something you need to remove? If you need to remove something, you would need to roll back to the last checkpoint that didn't include it.
Not really. If they are ingesting people's data to train their model, at what point are they checkpointing? Is data I submitted 2 months ago that I would want removed not present in a recent checkpoint, or would they have to "unwind" the last 2 months of training to remove it? What about people who would want something removed 6 months prior? What if something in the core model is to be removed?
Checkpointing may help, sure, but it isn't going to allow you to remove a single piece of training data without headache, and potentially substantial retraining.
If that 3% from the GP's comment is uniformly distributed throughout the training history, the only way it can be reliably removed is to retrain from scratch.
I may trust that slightly more. Or Amazon's Bedrock. Or GitHub. But wait, do I see copyrighted code on GitHub Copilot, which is owned by Microsoft?
My background is in building distributed systems for self-sovereign ownership and making them easy to use and available to everyone. Like https://qbix.com and https://intercoin.org
You should have the software infrastructure of Facebook and Twitter but choose where to host it.
In addition, I prefer Wordpress, Discourse forums, GitLab to GitHub, Redmine to FogBugz, etc.
When it comes to YOUR art, your code, your content, your relationships, you should be able to run it on your own servers.
Any analysis of your data should be done locally, with local models. You should have the open source software and the weights, and you should be able to choose which hosting company to trust, or host on-prem. Villages should be able to do this without needing server farms in California. This should be obvious stuff. But we the people need the software!
But hey, the documentation and teasers and trailers should go on YouTube and TikTok.
It isn’t even about them scraping all your content and training on it. It’s about not giving Twitter all your followers and YouTube all your hours of video production and content so they can give you pennies for being “an influencer”. Own your community!
Someone had to build it, and in the Web2 space nearly everyone sold out to venture capitalists. I was sure that in the 12 years we've been building Qbix and 5 years of Intercoin, someone would make a better open source alternative to Big Tech and Big Finance. Nope. They all either sold out, or have a solution that doesn't compete on features (e.g. Mastodon). I would say the closest is Matrix!
I mean, are you suggesting (for example) that someone who wanted a career in short-form video essays should skip YouTube and host their own videos? I don’t see how that could possibly work in general.
They should treat youtube as another marketing channel (as a showcase on the street is), but not as their main source of income or backup for example.
- They can run a website with their own personal brand and collect emails there.
- They can back up their videos to other services in case YouTube closes their account
...
Well, the concern I have is about alternatives. Joanna Videoessayist might not have the technical know-how, the business acumen, or the capital to build her own platform — she just wants to make great video essays. Eliminating platforms like YouTube wouldn’t make Joanna better off, it would just cause her to return to her marketing job.
No one is eliminating YouTube. We're making a better open source alternative.
If it's good enough, she will leave YouTube just like her mom left AOL and embraced the open Web. Why did content creators leave MSN, CompuServe, et al ?
They aren't trusting OpenAI. They are purchasing through Azure, and given most of these companies probably already trust Microsoft with a ton of data, this feels like much less of a leap.
AFAIK some azure services can be run on premises if you're a big enough customer. Don't think GPT APIs are available yet but might in theory be possible in the future. Self hosting a model the size of GPT4 would be insanely difficult and expensive. but might be a workable solution for data sensitive enterprises like Apple, JPM and government orgs.
Microsoft has had two different iterations of on-prem Azure. The TL;DR is that you’d buy a full rack of hardware from them and you could manage it using Azure front end.
The first iteration was a miserable failure. The second was a heavy lift to install: Previous employer was an Azure customer and actually bought the hardware. It then took months to get it installed and working, at which point the appetite for that capability was gone.
1. The value in proprietary software largely lies in the source code, which is not shipped in on-premises deployments; you are only getting compiled binaries. There is no such thing as "compiled binaries" for neural network weights.
2. License/copyright - Even in the best possible case, there are plenty of jurisdictions that don't care about US copyright and would love to get a hand on those weights.
3. We haven't seen copyright cases around model weights before. If I managed to exfiltrate OpenAI's model weights, I would continue training for a few iterations, and then I think it would be quite difficult to prove that I actually have the same model as OpenAI. This is untested; why would they risk it?
4. Running these models requires a ton of resources, vastly beyond the typical onprem deployment - why would Azure invest in making this possible when it really could only impact a very small percentage of companies?
The weights are OpenAI's lifeblood; I imagine they are very protective of them.
I don’t know … my company (whose revenue is on the same order of magnitude as Apple’s, though granted we’re not in the tech industry) keeps absolutely everything on OneDrive.
I believe that iCloud is hosted in a number of cloud vendors, and on Apple's own servers as well. It is both a large (read: rambling as well as big) endeavor, and one they keep flexible so they have a good negotiating position to keep down vendor costs.
This is so stupid. I just checked and yeah, that's why Browsing wasn't available for me! I wish their UI wasn't such a mess and it would have told me. I disabled history because I find it annoying that it sticks around and I don't care to use it. I didn't expect this would prevent me from seeing the Browsing plugin. Had I known I wouldn't have disabled it in the first place. Definitely a facepalm moment.
If Apple is working on a ChatGPT equivalent, then they wouldn't release it internally to their 150,000 corporate employees first. Also, if they're working on a ChatGPT competitor, I highly doubt it would be near-ready considering how far behind they are in the generative AI space. The only reason Google released a competitor so quickly is because they were already working on it internally and were "forced" to share it publicly.
I sure hope so, because Siri has gotten worse in the last 2 years. Commands I used to use daily now fail, saying it can't do that; then I try the same prompt on a different device and it works. I do not understand why Siri works sometimes and just fails other times. The HomePod Siri failed so often and is now so bad that I actually turned Siri off on it so my phone or watch would handle the request.
Something very weird is going on with the voice interfaces as I keep hearing this about each of the different assistants.
My experience is of Siri being, if anything, slightly better than a few years back; conversely, half the time Alexa would respond to me saying "Küche hundert prozent" ("kitchen, one hundred percent") with "Ich kann nicht Küche auf Spotify finden" ("I can't find Küche on Spotify") and we don't even have Spotify.
This is by design. All the tech giants realized search-by-speech is mostly unprofitable compared to search-by-type, because you can't really sell ads aurally the way you can visually. Every search done by speech is lost ad revenue. Expect to see less and less search by voice over the coming years
Unlikely. Apple management has been awful about this space with regard to Siri. Zero accountability at the VP level. This should be a cakewalk for Apple, and yet here we are. Their implementation should be leveraging their hardware, and Johny Srouji can implement enhancements to the neural engine & GPU to drive their LLM and establish a personal, privacy-focused AI assistant.
From a user's perspective, they haven't shown much evidence of building new, competitive capabilities for years. Sometimes it feels like Siri has actually become less useful.
People love to cite how "behind" Apple is in this space, but never say how. This is another variation of the oft-repeated mistake of lumping Apple in with "Big Tech" in all ways. Apple is not a gatekeeper to vast portions of the Internet the way Google and Amazon are. Nor is it obvious how Apple or its customers stand to benefit from generative "AI" at this point.
So, if anyone really thinks Apple is suffering from being "behind" in it, by all means expound on how. I'm genuinely interested.
The same could be true of Apple. But they aren't "forced" to release it. Apple wouldn't release a competitor on the Web, they would build it into macOS/iOS/whateverOS. We will know when WWDC starts.
I don't know that this is clear proof. Apple is notorious for keeping people out of the loop late into a product's lifecycle.
That’s not to say I think they’re making an equivalent, just that I doubt many of the employees looking to use ChatGPT would have access to their internal LLM even if it was excellent.
It might be good evidence that they won’t be announcing one at WWDC in a couple weeks, since I can imagine they might roll something like that out internally a couple of weeks before launch, but I wouldn’t bet on it.
When I worked at Apple, they would never have rolled something out internally that they had yet to release. Sometimes they invite you to test something in a very controlled environment, but they would not just give access outside of the controlled setting. Probably in-person testing. Surrender your phone etc…
I had friends who worked on the initial iPhone design, supply chain, software etc..
We would go to lunch fairly frequently - and jeez, those guys were all extremely honorable about their NDAs/secrecy - it was impressive.
One of the guys moved to Google and worked on some of their first mainboard designs - which were secret at the time, and I only found out about it when I went to meet him for lunch and saw a mainboard under his desk and said "ooh whats that!" He freaked out and we got out of there quick....
Some of the hardware at FB in ~2012/13 they designed was really awesome....
I always wonder what happens to these closed HW systems as they uplift/replace them over time. I am sure they are destroyed. Which sucks for many reasons.
---
When we built Lucas' Presidio campus and converged their DCs to that campus, they threw out tons of huge SGI boxes - and I was free to take some - but I didn't have any place to put them at the time - and I wish I had figured that out more earnestly, as SGI always had beautiful cabinets.
Apple has a large AI group, including both the Siri team and the one designing the inference hardware in the A/M series processors (the "Neural Engine"). And you can bet that there are a lot of AI people working in the heavily rumored auto group. None of those are directly LLM-focused, but I don't know why you think that Apple does not have any experience in this space.
I work in this space (language & sequence modeling) and I also know people who work in the self-driving space.
They have little expertise in language specifically. Relative to almost all other tech companies, they have a very small share of DL practitioners.
Self driving is a completely orthogonal problem and Apple is mostly relying on classical techniques there, not DL. Hardware groups are not modeling groups.
As someone who worked at Apple: they have an incredibly strong culture of internal siloing. I would be zero percent surprised if they were building one but no one knew about it
> How does OpenAI use my personal data?
Our large language models are trained on a broad corpus of text that includes publicly available content, licensed content, and content generated by human reviewers. We don’t use data for selling our services, advertising, or building profiles of people—we use data to make our models more helpful for people. ChatGPT, for instance, improves by further training on the conversations people have with it, unless you choose to disable training.
That'd fit in nicely with the other two things they shout: "AI IS EXTREMELY DANGEROUS AND THREATENS TO KILL US ALL." "But sign up here, guy, your $22/mo. means this AI is now safe and contained :)"
> And then also say "By the way, we have private API access that doesn't use anything you say as training inputs- maybe your company would prefer that?"
This isn't enough for many companies, since the data still goes out the door. They would have to set up on site hosting to appease security minded orgs. Or, maybe that's what you mean.
> OpenAI already offers private ChatGPT instances hosted on Azure.
> They don't use it for code/confidential data though.
Yes, private isn't enough. They need to offer self hosted, for these types of clients. I imagine most orgs who need self hosted would already have a datacenter to run it in.
Oh, that is pretty much their monetization strategy. They're still on the hype step, though. As soon as they have enough mass, they will do exactly what you say.
For now they are just selling individually to CISOs, so they can pay a higher cost while looking savvy to the CEO.
It's so brilliant. As a layperson, I wonder how much further than transformers they went. Like, is it transformers that are brilliant, or GPT (essentially)? Both, I suppose, but then I wonder whether there is a big distance between the two.
ChatGPT is disallowed by default at pretty much every large company, just like every other external service that isn't explicitly approved and hasn't signed a contract. Apple employees aren't allowed to use their personal Dropbox account for storing corporate documents either, for example.
All such articles you see are just security teams clarifying the existing policy – you weren't allowed to use it before and you are not allowed to use it now. It's only noteworthy because it has ChatGPT in the title.
I work at a large publicly traded company and our org was told to treat it as if it were Stack Overflow. It's okay to use, but assume everything you give it will be seen by people who do not work for us and everything it gives you is potentially dangerous or incorrect.
For a company of Apple's size, banning ChatGPT entirely is probably the only effective way of preventing people from training ChatGPT on their internal data.
Apple's security policy already prohibited putting confidential or proprietary information into any not-explicitly-approved, externally-hosted service. So I'm guessing this wasn't a "ban" so much as reminding people of this policy and that ChatGPT is still unapproved.
The reasons go beyond using the data for training. Submissions to servers end up in logs, databases, temp files... who knows. And a company like Apple wants to ensure not only that the data is used only as the receiving party explicitly states, but also that it isn't inadvertently made accessible to others via poor security procedures, since their data is such a high-value target.
Indeed. I've also had a company-wide reminder that the "do not upload confidential data to external websites without authorization" also includes ChatGPT. Mind you, I don't think it would be any good at writing Verilog.
It seems like this is just a temporary issue, I imagine within a year a company like Apple can get company wide access to ChatGPT with suitable privacy provisions in place. For a hefty price of course.
Unless they move ChatGPT in house, privacy assurances are just words on a piece of paper. They would have to trust OpenAI, meaning they would have to trust Microsoft, with their internal data. Doubt they would do that instead of waiting for the hype to die down and download one of the many ChatGPT clones that'll be available within a year and run it in house.
Only if you get caught, and it's pretty hard to get caught when you peek at some logs in the servers located in the same room as you. Besides, leaks happen all the time and the fines aren't too bad either.
The real question is why would MS risk losing the business from Apple in such a context.
There are ways to set up incentives such that large corporations will follow the rules. There are myriad examples of this. But in most cases you can ask what is more profitable: taking $1B+ a year from Apple for hosting an internal ChatGPT service for them, or maybe, if they're lucky, stealing trade secrets that MS will never be able to compete against Apple with anyway?
This is all hypothetical; I just want to point out that these businesses are not just waiting to do evil things, they simply respond to incentives, like any business.
I work in data privacy and GRC. First, a random middle manager cannot simply peek at customer prod data; that's seriously locked down. Second, attempting to do this would get you immediately terminated.
They are just words on paper. They're words on paper that you can sue over if the promises are broken, of course, but that doesn't at all eliminate the fact that you have to trust the entity you're dealing with.
But that's true of every vendor you work with, in just about any capacity. Even if you run your own on-prem infra, you're still trusting that your vendor-supplied hardware and software doesn't have any backdoors or other data leaks.
Businesses put a ton of trust in contracts—with hefty penalties for breaking them—all the time. Our entire economy is built on this.
> But that's true of every vendor you work with, in just about any capacity.
Yes, that was part of my point.
Contracts are a mechanism that is heavily relied upon, no question about it. But their role is really more about making the terms of a deal crystal clear. In terms of legal enforceability, that is certainly a very important thing, but it's not like contracts are a panacea. As the old saying goes, a contract is only as good as your ability to enforce it, and lots of contract violations go unpunished simply because the one who was violated can't afford to sue.
I can totally see Apple being content with sending all their internal plans and discussions to a company mostly funded and entirely hosted by Microsoft
PSA: My own employer (not Apple) restricts the same, and is pushing employees to use an internal Azure GPT3.5 deployment instead.
Unlike OpenAI, we do not disclose to employees that all prompts are logged and forwarded to both the security and analytics teams. Everything is being logged and replicated in plaintext with no oversight.
So be careful about code snippets with embedded creds or asking EmployerGPT really stupid questions about how to do your job. The priests are recording your confessions so you never know how or if they'll get used against you later.
I feel like users need to be better educated and held accountable.
Would you post proprietary data on Stackoverflow? No. You would formulate a generic question with any IP removed. That’s how we should use public ChatGPT.
So I think there’s an argument for a monitored portal of ChatGPT usage, where you are audited and can get in trouble. Heck even an LLM system itself can help identify proprietary data! Then use that to educated people and hold them accountable.
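For example, such a portal could run a simple pre-submission screen before anything leaves the network; here is a minimal sketch (the patterns and the prompt_is_safe helper are hypothetical, purely for illustration):

    import re

    # Flag obvious credentials before a prompt ever leaves the company network.
    SECRET_PATTERNS = [
        re.compile(r"AKIA[0-9A-Z]{16}"),                         # AWS access key id
        re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),  # private key blocks
        re.compile(r"(?i)(password|api[_-]?key)\s*[:=]\s*\S+"),   # inline credentials
    ]

    def prompt_is_safe(prompt: str) -> bool:
        return not any(p.search(prompt) for p in SECRET_PATTERNS)

    assert prompt_is_safe("How do I reverse a list in Python?")
    assert not prompt_is_safe("my api_key = sk-live-12345")

A smarter version could ask a locally hosted model to classify the prompt, but even a dumb filter plus an audit log goes a long way toward the accountability described above.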
I'm a bit surprised by the comments here. Looks like people really find LLMs useful for their day-to-day work. I'm surprised because I (maybe naively) thought the level of hallucinations in these tools was too prohibitive to get real value.
I personally don't mind that Apple bans ChatGPT. The interesting part of this news to me is how many people seem to get real value from it. To the point where companies invest in getting private instances/versions.
How do you use these LLMs, and for what kinds of tasks?
Do you feel AI enhanced?
Have you actually sat down and used these models quite a bit?
I pay for GPT-4 and it is worth its weight in gold; I can engage in a conversation about technical topics.
Sure, they do "hallucinate" sometimes (so do my coworkers FWIW), but you can engage in a conversation about why it is wrong and they are quick to correct themselves (...this part can be less like certain coworkers).
Here's an example of a random question (relevant to my job) that I asked gpt-4
> In linear/logistic models, it is common to take the multiplicative interaction between two features which might be one hot encoded. For instance, A is a vector that one hot encodes your college and B is a vector that one hot encodes your employer, you would take the product len(A)*len(B) as the size of the new interaction vector. How do you make such cross/interaction terms in tensorflow?
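(For reference, one way such an interaction term can be built in TensorFlow; this is my own minimal sketch with made-up shapes, not GPT-4's actual reply:)

    import tensorflow as tf

    # Hypothetical example: one-hot "college" with 3 categories, "employer" with 4,
    # for a batch of 2 rows.
    A = tf.constant([[1., 0., 0.],
                     [0., 1., 0.]])
    B = tf.constant([[0., 0., 1., 0.],
                     [1., 0., 0., 0.]])

    # Per-example outer product, flattened into a len(A) * len(B) = 12-wide vector.
    cross = tf.reshape(tf.einsum('bi,bj->bij', A, B), (-1, 3 * 4))
    print(cross.shape)  # (2, 12)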
GPT-4 is less full of shit than most people I know and I don't think it is even close.
I have always found the knock that it confidently answers things it really doesn't know even if wrong quite hysterical. That exactly describes some of the smartest people I have met in my life.
I use it with the documentation of what I'm building.
Paste in the relevant section of the developer docs, ask it to write a function, or ask it to produce a valid API request, etc.
I suppose it's like a junior dev in a sense but it's very fast, can correct its own mistakes if you point them out, and has an internet worth of knowledge about various approaches and troubleshooting techniques.
I find most people who don't see value haven't actually used GPT-4 yet. Have you?
I use LLMs for generating data, for example to fill SQL tables for tests, or some Python data structures. Transforming data works really well, so tasks like "take this BibTeX entry and convert it into a citation"
I've tried to code an entire iOS app with it but failed. You still need to have knowledge about what you are actually doing. I feel like it can accelerate learning, but in a very narrow way. When I was trying to do my own Android app 3 years back, I learned a bunch of different things about what I was actually trying to do and also about the whole ecosystem. When I was trying to code the iOS app I made progress fast, but I felt like I was just learning the thing that I wanted to learn, not the actual surrounding knowledge. Now this sounds good, but I feel like I'm leaving something on the table when doing this kind of "accelerated learning".
I've also tried autogpt/langchain approaches to automate writing some commands for me, like "remove all files that begin with abc". It did generate an appropriate command, but sometimes if it failed (the files weren't there in the first place) it just fell into a loop of constantly trying to get a successful deletion by refining the command. It even tried to go to / to find all such files; talk about paperclips, huh. Truth be told, I haven't played with the idea much, so there is room for improvement
LLMs kinda help, and they are especially good when provided with source, but I don't really feel like a 1000x super-hacker with VR glasses who spawns hundreds of agents each minute to do different tasks. (I would like to, though)
I would like to do something with LLMs, but every idea I have already has an established startup, or the argument "you could just write this in ChatGPT" is made. Literally take whatever comes to your mind and add "LLM" or "GPT"; there is a startup for that.
Lots of people use ChatGPT for things that hallucinations aren't an issue for. Drafting emails or articles or blog posts (which the human reviews and polishes) and other fun things like generating haikus are pretty common use cases.
Personally I also find the level of hallucinations (with GPT 4) to be minimal as long as it's a widely covered subject. The more technically niche, the worse the hallucinations get.
You can paste some gnarly code into gpt4, tell it "I think this code has an off by 1 error in it someplace" and GPT4 will find it for you.
So yes, it is useful.
It has helped me many times, "You've listed the items in this config file in the wrong order", saved me hours of debugging right there.
Even stupid stuff that I do once every couple years, I can ask GPT to setup my initial express server stuff "I want CORS enabled and these endpoints to process JSON" and it pops out code a few seconds later and saves me maybe 5 or 10 minutes of Googling.
I recently had GPT4 write me skeleton code for using Websockets, I'd never used them before and having something to base my work off of saved me a lot of time.
The company I work for uses ChatGPT, StableDiffusion, ElevenLabs, and lots of other AI tools every day now. We make ads and use it to generate scripts, create "stock" photos, fake voice over, etc. We even have started training our own models to match branding guidelines and whatnot.
I also use ChatGPT and SD to help me with my personal hobby of game development. It's good at analyzing code snippets, refactoring, and SD is good for inspiration.
These are really useful tools and it's only the beginning.
Many things. Writing tasks are aided but it helps a lot with coding imo. SO and the internet as a whole is more infuriating than helpful a lot of the time.
Also ChatGPT can do a lot of basic logic and repetitive tasks for me. It helps get things going and saves a ton of time and effort.
Autocompleting my console.log statements with uncanny accuracy is already worth $10 a month to me.
It's hit or miss so I'm not surprised to hear from someone who sees more misses than hits. But when it works well I feel this rush of gratitude that I don't need to type everything out.
Depends on the level of experience of the person using it, or the given task. Less experienced coders find it useful because it saves time spent learning how to do things. Less demanding tasks are trivial if they involve basic coding. For others it doesn't help much.
To be fair, that's true. If I switch from a language I have experience in to a language I am not familiar with, then I find it useful. But I still dive into what its output does so I can actually learn the new language.
I use it daily to write code for me, mostly of the "here is some input data, please use this API and this other API to transform it into this example output data". It works flawlessly 95% of the time.
I use it to search esoteric stuff about tools I don't use that much and that have horrible UX. What day is Sunday to cron? How do I specify the first text field to 'cut'? How the heck do I do x thing in awk? GPT gives me an answer I can try and iterate on. I'm using it as a glorified search engine with an extremely clean and pleasant interface. I'm not asking it to write production code whole cloth.
TLDR: There's an old saying 'trust, but verify' and that applies to everything gpt gives you.
The innovation with generative AI is fundamentally legal: they're copyright laundering systems. You put a few images (books, etc.) in, mix it around, and "poof" there disappears the copyright.
I think this largely accounts for when it will be useful. (And why companies will and will not use it).
> I think this largely accounts for when it will be useful.
I think this is excessively cynical. For example, I just used chatGPT to write a job description last night. I'm not particularly interested in copyright laundering or anything of the sort - nobody really cares about copyright on job descriptions (they're mostly pretty similar) unless you're being egregious about it. I could spend a bit more time and just pattern match of other postings, but the LLM can do it better and faster than me. I just provided appropriate prompts for the job characteristics and benefits, and then provided the final editing and discretion.
I think there are lots of similar situations where we (as humans) need text that is essentially boilerplate, but still is expected to be well-enough-written. ChatGPT isn't a copyright evasion tool to me, it's a shortcut for better writing.
For me, when I use the output I am acutely aware of how difficult it was to create the sources it's sampling from.
Behind each line I can imagine a person solving a problem, for themselves and writing up their solution, and sharing it. This process is inordinately expensive for each individual: they must be competent with the ideas, techniques, etc. and deploy them in a novel circumstance.
LLMs need no competence with the ideas; indeed, they need no ideas or techniques at all. The very same work can be obtained via two radically different algorithms: 1) develop conceptualisations and techniques to generate solutions in the face of novel problems; 2) sample from all such prior attempts. (2) requires many cases of (1) to work -- and that's the copyright laundering, or just theft, if you like.
Who in all their writing on the internet was consenting to train ChatGPT? No one.
This is the innovation. We have Google search. We can easily get 99.5% of whatever ChatGPT generates if we have the gall to copy/paste it. Since we don't, we launder these efforts through a novel interface.
It is an extremely useful tool, but we should be more aware of where its utility comes from: from the last twenty years of human labour which has placed all our lives and thoughts on a digital commons, which is here being replayed back to us as if "ChatGPT wrote it". The AI here has written zilch. Absent all those digital efforts, this is a dumb system with nothing to say.
And if we had the audacity just to copy/paste from ebooks, github, blogs and the like -- we would barely benefit from it at all.
It's funny, but even if the LLM were open source and free, it would still be unethical in the sense of not attributing all this effort it is basically stealing.
So that's the first problem to solve. It's not an easy one, because depending on the context the output could be just generic (you can't attribute the entire universe), but in a specialized context it may well be literally impersonating some expert's output.
You just wrote and typed out two sentences and they were beamed across copper wires and glass tubes, and it was predicated on a mix of:
- concepts you've been taught across a lifetime
- concepts you don't even know exist
- hard labor and pain no one person can even track
An unspeakable mountain of effort from people you'll never know to enable a simple social interaction.
-
No one of us is an island. I think people underestimate how interconnected humanity is, and LLMs have been a mirror that forces them to understand just how little of us exists in isolation, and how much we ourselves are mirrors of knowledge and influences so far removed from us that we can't even acknowledge them if we try.
> I think people underestimate how interconnected humanity is
That is a true fact (and it underlies much social dysfunction), but you can't use it to justify a free-for-all.
Notice I mentioned two social inventions that individuals found important for keeping count of that interdependency: money and copyright/attribution.
While faulty tools in many ways, abusing them is not going to lead to anything better. In fact, if actors get away with it, it would be a signal that power now rests with a new type of appropriating oligarchy.
> Behind each line I can imagine a person solving a problem, for themselves and writing up their solution, and sharing it. This process is inordinately expensive for each individual: they must be competent with the ideas, techniques, etc. and deploy them in a novel circumstance.
I'm really worried if you weren't doing this before LLMs. No person in the information era has ever had a thought that wasn't built upon an unimaginable number of people solving problems for themselves, writing up their solutions, and sharing them.
Calling what an LLM is doing copy/pasting is like claiming you've only ever copy/pasted other people's thoughts.
> This also describes the process of going to university.
I do not recall the lesson wherein I sampled from thousands of prior exam papers to generate my answer to the one I sat.
Without any prior exam papers at all I could, indeed, sit the exam.
The algorithm which animals follow to be able to do such great things is very expensive: acquiring actual skill, technique, knowledge and competence with the world.
The algorithm statistical AI follows is simple: hoover up all digitised human history and sample from it.
This is entirely dependent on humans not doing that.
We have people doing that all the time: writing books, blogs, videos, GitHub, etc. 99.5% of anything we want ChatGPT for already exists -- we just don't have the audacity to steal it.
An LLM only trained on exam papers wouldn't even be an LLM.
You said above:
> I do not recall the lesson wherein I sampled from thousands of prior exam papers to generate my answer to the one I sat.
No. You instead sampled from thousands of prior conversations, textbooks, story books, movies, songs... to form how you write, how you think, how you understand.
Whatever you studied specifically for that exam wasn't even the tip of the iceberg of what was required to write your answers, especially considering that what you studied was itself predicated on the efforts of others, in a way that makes your weeks of cramming infinitesimally small in the scale of effort that goes into writing a single answer on that test.
You will find no such things in the body or the brain.
This dumb 'statisticalism' is false -- and a product just of the impulse for engineers to misexplain reality on the basis of their latest engineering successes.
Animals grow representations of their environments which are not summaries of cases --- they're sensory motor adaptions which enable coordination of the body and mind in every sense.
You're responding to a comment no one wrote. The point wasn't about how those representations are formed: it was that you integrate many, many environments to do anything useful. If someone were kept isolated enough from birth, regardless of innate intelligence they wouldn't be able to take that exam even if we gave them a year and made it open book: Genie took half a decade to reach a 2-year-old's language skills.
But there's also a certain irony in your dig on engineers mis-explaining while trying to paint the links between "sensory motor adaptations [sic]" and statisticalism as a response to recent engineering successes...
It's always sampling from an existing distribution of relevant data points -- that's necessarily how it's working.
If you want to claim the sample set is only mildly similar to exam questions -- so be it, that may be true.
Or if you want to claim that its sampling method is attentive to structural associations in its sample set, so that it's not lifting from "identical distributions" -- so be it.
So long as those "structural associations" are givens, and the data "givens", the process is just sampling from a domain of human effort without expending any of a similar kind.
If there had been no internet, ChatGPT would be a dumb mute -- because it has no capacity to generate data; it does not develop actual conceptualisations of the world -- it samples from the data shadows created by people.
To produce useful data requires expending a tremendous effort -- growing an animal to cope with the world. It is this which is being laundered, unpaid and unacknowledged, through LLMs.
Whilst star-trek-huffing loons claim this stuff is doing the opposite -- an ideological delusion which benefits all those whose bank accounts are increased by the lie that "ChatGPT wrote this".
If we were prepared to price the data commons which has been created over the last 20-30 years of the internet, by everyone, it's not hard to think training ChatGPT would cost a trillion.
How much labour went into creating that digital resource, and by how many, etc.?
> It's always sampling from an existing distribution of relevant data points -- that's necessarily how its working.
I work in this field, and 'sampling from an existing distribution of relevant data points' is just wrong, and you have no way to say that is 'necessarily how it's working' a priori in a world where implicit regularization exists.
Not going to engage with the labor-theory of value bit because I think it is not particularly relevant to the disagreement I raised and not one with a 'right' answer.
lol, I am not making an argument premised on the labour theory of value -- ChatGPT is a proof against this theory: $20/mo for the labour of two generations.
It is the prerogative of states to make redress when labour is severely underpriced due to the falsity of this theory of value -- and had they, ChatGPT would be exposed for what it is.
Regularisation is such a horrifyingly revealing term.
Reality isn't a regularisation of its measures -- the meaning of words is not a regularisation of their structure.
Such obscene statistical terminology should be obviously disqualifying here.
Our knowledge of the world isn't a statistical regularisation of associations.
That very framing exposes how deficient this line is.
Animals grow representations --- they do not regularise text token patterns.
Why not? This would fix the regurgitation issue and possibly be more palatable to copyright holders. The LLM trained on both copyrighted and open works can generate training data for a second LLM that never sees copyrighted works, achieving perfect separation of idea from expression.
Copyright is supposed to protect expression but not ideas. Ideas are free, LLMs should be allowed to learn ideas. This can also filter out bad quality data and PII, or expand on parts of the distribution with less samples. It would be "dataset engineering" as opposed to "model engineering".
A recent paper (TinyStories) shows you can train an LM entirely with synthetic data and even evaluate it with another LLM. They could make a model 1000x smaller that still has fluent English - child-level English more specifically, but well articulated, and even capable of reasoning.
I believe in the future LLMs will train on 100% synthetic data from previous generations of LLM. The benefit will be increased quality, more privacy, less copyright infringement and smaller more efficient models, like TinyStories. Only the first few generations of LLM need to scrape the whole internet, the next generations will have better data.
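A rough sketch of that "dataset engineering" loop (everything here is hypothetical; teacher_generate stands in for whatever API the big teacher model sits behind):

    import json

    def teacher_generate(prompt: str) -> str:
        # Stand-in for a call to the large "teacher" LLM (e.g. an API client).
        return "Once upon a time, a small dog found its way home..."

    # The teacher writes synthetic examples; a smaller "student" model is then
    # trained only on this output and never sees the teacher's original data.
    topics = ["a lost dog", "a rainy day", "a broken toy"]
    with open("synthetic_stories.jsonl", "w") as f:
        for topic in topics:
            story = teacher_generate(f"Write a short, simple story about {topic}.")
            f.write(json.dumps({"topic": topic, "text": story}) + "\n")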
I see this comment all the time in this conversation and it strikes me as being in very bad faith. I would love to understand what you're trying to get at.
Because - first, in a legal sense, it seems wrong? You cannot memorize a book and then reproduce it from memory to disappear the copyright. The law on AI & copyright is very young (and I am not an expert), but artistic expression cases that involve copying (i.e. Warhol) have rested on the idea that the copying is transformative in some way - but this seems in direct opposition to the idea that machines cannot create works themselves. I.e. the legal understanding that protects AI from needing to respect copyright also suggests it should not be understood to be traditionally transformative.
On a less legal level, an AI ingesting a huge amount of content and adjusting a neural network is very different from a student going to university in most ways. There are also similarities! But in a general way the direct comparison does not line up particularly well. So many complaints about Transformer-based AI exist on levels outside of the precise "agent doing the learning" (including this one).
I think it's the opposite. There are of course companies that are being more conservative than others. And there are companies that are going to be more successful than others using this stuff. Because those companies may end up with an edge over those other companies. It's that simple. You can hope for courts to put a stop to that but you'd be a fool to expect them to lift a finger any time soon. And can you afford them taking a different decision?
Probably Apple is right in not being overly worried about losing an edge here short term. But that would be because they are quite comfortably leading way ahead of the pack. They can adjust as needed and cost is not a factor when you ship products with such fat margins as they do. In other words, they can afford to be conservative here and see how things play out and adjust later. A few thousand people in their team being inconvenienced by having to do things manually without AI assistance doesn't really add up to a whole lot of cost for them.
That's not true for a lot of companies. Other companies might have to make different choices in order to stay relevant. In the end it's a choice between artificial intelligence or self-imposed stupidity. Not making proper use of resources available as a company can be a fatal mistake. And things like chat gpt are here right now and usable for anyone who cares to use it. And with some clear added value.
As for the legalities. I wouldn't get my hopes up for judges to apply interpretations of copyright law that are inconsistent with the past ones. Copyright is not going anywhere. And forget about politicians adding much to the law that is going to matter any time soon.
It boils down to concepts like fair use, people actually starting court cases, etc. Lots of people talking about that but not a lot with the deep pockets to follow that through. Not a lot of coherent lobby activity on this front to make politicians move. Lots of companies with deep pockets and a vested interest in this stuff and a lot of lobbying power to ensure the gravy train doesn't grind to a halt.
Apple will eventually integrate AI into their business. It might not be the MS/OpenAI flavour. Or the Google one. Why would they want to depend on that? It's going to be a business decision for them.
If you asked any of the CEOs of those companies banning ChatGPT, they'd say we're in for a glorious AI-shaped future, right up until it comes to them. It doesn't really inspire any confidence.
Can you imagine if Marlboro or Philip Morris forbade their employees from smoking for health reasons?
A lot of those companies have already set up their own instance of OpenAI's models running on Azure. I'm not involved with this directly, but it is my understanding that Microsoft are selling (and pricing) this aggressively.
Perhaps it is more like Marlboro forcing its employees to smoke Marlboros - if those Marlboros were secret, non-public cigarettes only available on Marlboro premises. OK so maybe this metaphor is a little thin.
I'm fully aware my initial metaphor was a bit forced and it doesn't quite match 1:1 to the original point, but using AIs while at your workplace is the ideal usage they're trying to sell here.
It still gives me a "Do as I say, not as I do" vibe, and reinforces the idea that AI tech is being used merely as a trojan horse to capture and monopolize previously inaccessible markets for the companies building and training the models.
What about all of us plebs who work at small companies not able to viably self-host these models? This is the point OP was making. They want us to use their model at our jobs, while they are the first to disallow it at their own offices.
I mean, ChatGPT is not a creation of Apple, so it would be better if you reported an example. Also, a GPU is not that expensive, and you can run LLMs like StarCoder, or more general ones like Vicuna, on it for a small team of people with ease.
My company is huge (100k people) and luckily they also see it as critical and will make it centrally available to us soon.
But this will be a problem for big companies; small ones normally care less about these types of things. This means big companies have to do something, otherwise they will be competing with ML-enhanced developers.
It's a significant advantage for developers to ask an AI to solve technical problems, get the right answers right away, and move on to the next task, versus scratching your head for the next 5 hours wondering "why doesn't it work".
I think equal advantage lies in getting multiple approaches fleshed out, even if the answer isn't right, as in, may not compile as-is. That is more than sufficient advantage a developer gets because most of the time usually spent isn't actually typing the code but finding out how to design/combine things.
E.g. I was working on a rust problem and asked ChatGPT for a solution. What it provided didn't compile (incorrect functions etc) but it provided me information in terms of the crates to use, general outline of the functionality and the approach to combine them - that proved to be more than enough for me to get going (to be clear, I didn't blindly copy the code; I understood it first and wrote tests for the finished product). I think that is where the real advantage lies. I see it as an imperfect but very powerful assistant.
Have you ever tried to solve a difficult technical problem with ChatGPT (any version)? I have. It worked about as well as Google search. Which is to say, I had to keep refining my ask after every answer it gave failed until it ultimately told me my task was impossible. Which isn't strictly true, as I could contribute changes to the libraries that weren't able to solve my problems.
Funny enough, the answers gpt4 gave were basically taken wholesale from the first Google result from stackoverflow each time. It's like the return of the I'm Feeling Lucky button.
There are many developers who are unable to do that and need to be spoon-fed. This is the market for ChatGPT and that's why they heavily promote it.
I doubt though that corporations that employ these developers will have any advantage. To the contrary, their code bases will suffer and secrets will leak.
This is very much my experience too. Occasionally ChatGPT can give me something quickly that I wasn't able to find quickly (because I was looking in the wrong place, likely). But most of the time, it's just a more interactive (and excessively verbose) web search. In fact, search tends to be more optimized for code problems; I can scan results and SO answers much faster than I can scan a generated LLM answer.
Use the edit button if you get an incorrect answer. The whole conversation is fed back in as part of the prompt so leaving incorrect text in there will influence all future answers. Editing creates a new branch allowing you to refine your prompt without using up valuable context space on garbage.
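The underlying reason is that a chat model only sees what's in the current context window, and the visible conversation is resent with every request; a sketch of the message format (illustrative, not ChatGPT's internals):

    # Every new request resends the visible history, so a wrong earlier answer keeps
    # steering the model until it is edited out of the conversation.
    messages = [
        {"role": "user", "content": "How do I parse a date string in Python?"},
        {"role": "assistant", "content": "...an incorrect answer..."},
        {"role": "user", "content": "That didn't work, please try again."},
    ]
    # Appending corrections leaves the bad answer in context (and costs tokens);
    # editing the earlier turn starts a fresh branch without it.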
That doesn't change anything in terms of the flow. It's still refining the input over and over. This is exactly how searching on google works as well.
For the case I last tested, there was no correct answer. I asked it to do something that is not currently possible within the programming framework I asked it to use. Many people had tried to solve the problem, so ChatGPT followed the same path since that's what was in its data set, and provided solutions that did not actually solve the problem. There wasn't any problem with the prompts; it's the answers that were incorrect. Having those initial prompts influence the results was desired (and usually is, imo).
I haven't actually seen that advantage in action. That is, I haven't seen a case where an LLM has actually given a solution right away for a problem that would have stumped a dev for multiple hours.
In my workplace, two devs are using chatgpt -- and so far, neither has exhibited an increase in productivity or code quality.
That's a sample size of two, of course, so statistically meaningless. But given the hype, I expected to see something.
ChatGPT is a godsend for junior developers, but it isn't great at providing coherent answers to more complex or codebase-specific questions. Maybe that will change with time, but right now it's mostly useful as a learning aid.
For complex codebases it’s better to use copilot since they do the work of providing context to gpt for you. CopilotX will do a lot more but it’s still waitlist signup. You could hack something together yourself using the API. The quickest option is just to paste the relevant code in to the chat along with your prompt.
I tend to use both. Copilot is vastly better for helping scaffold out code and saves me time as a fancy autocomplete, while I use ChatGPT as a "living" rubber duck debugger. But I find that ChatGPT isn't good at debugging anything that isn't a common issue that you can find answers for by Googling (it's usually faster and more tailored to my specific situation, though). That's why I think it's mostly beneficial in that way to junior devs. More experienced devs are going to find that they can't get good answers to their issues and they just aren't running into the stuff ChatGPT is good at resolving because they already know how to avoid it in the first place.
And this gets into why companies are banning it, at least for the time being; developers and especially junior developers in general think nothing of uploading the sum total contents of the internal code base to the AI if it gets them a better answer. It isn't just what it can do right this very second that has companies worried.
It isn't even about the AI itself; the problem is uploading your code base or whatever other IP anywhere not approved. If mere corporate policy seems like a fuddy-duddy reason to be concerned, there's a lot of regulations in a lot of various places too, and up to this point while employees had to be educated to some extent, there wasn't this attractive nuisance sitting out there on the internet asking to be fed swathes of data with the promise of making your job easier, so it was generally not an issue. Now there is this text box just begging to be loaded with customer medical data, or your internal finance reports, or random data that happen to have information the GDPR requires special treatment for even if that wasn't what the employee "meant" to use it for. You can break a lot of laws very quickly with this textbox.
(I mean, when it comes down to it, the companies have every motivation for you to go ahead and proactively do all the work to figure out how to replace your job with ChatGPT or a similar technology. They're not banning it out of fear or something.)
It depends heavily on what you do. When working with proprietary/non-public software stacks, or anything requiring knowledge of your internal codebase, ChatGPT is of little help.
"On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."
A great quote, but not applicable. The problem with LLMs is that they are non-deterministic, and you can ask the exact correct question and get the wrong answer.
Use ChatGPT in a domain you're a relative expert in and you run into a million scenarios where it offers a "solution" that will do something close to what was described, but not quite - and even as a domain expert you might not immediately notice the problem. Even worse, it may produce side effects suggesting that it is working as desired when it's not.
In the not-so-secret world of Stack Exchange copy pasta, people would have other skilled humans pointing these issues out. In the world of LLMs, you risk introducing ever more code that looks perfectly correct, but isn't. What happens at scale?
The net change in efficiency from LLMs will be quite interesting to see. Unlike past technologies, where the error was only user error, here we're dealing with a calculator that will not infrequently give you an answer that's wrong but looks right. What sort of 'equilibrium' people will settle into with this is still an open question.
> This means big companies have to do something otherwise they will compete with ml enhanced developers.
It's not that bad. In reality, if we're looking at large tech companies, they've got senior people who know pretty much anything you'd want to know, available within minutes or hours - which is something small companies just can't afford.
ML-enhanced devs may be a little bit faster and get some usually-correct help, but they won't get any wisdom.
I think the AI driven development story is more about leverage than it is about wisdom. Leverage from AI is derived from being able to do more with fewer people.
In my experience the great slowdown of growing companies comes from hitting the communication barrier on their products - the point at which the majority of effort is spent coordinating work rather than doing work. I find that the path from majority focus on product to majority focus on coordination isn't linear, but rather more of a watershed. One day you are 80/20, then seemingly overnight, after some growth, you are 20/80 the other way and never look back.
The advantage of being on the right side of that watershed is that you can maintain velocity and agility. Not only can you iterate quickly, but you're in a better position to change course and rebuild as needed. The left hand and the right hand require little effort to coordinate and get it done.
Larger companies live and die on their ability to either find a moat large enough to protect them, or build organizational structures that let them keep scaling. It takes decades to get the culture and processes right and baked in across the board for a large company to be able to maintain any velocity and reinvent itself.
This is where the gap is. Being small is easy, you simply don't have the coordination problems. But the moment you hit success and need to grow, you immediately are at a disadvantage compared to the big incumbents who have had decades to refine their coordination systems.
To the extent that AI can provide more leverage to smaller companies, allowing them to "grow" without actually crossing that coordination watershed, they will be in a much better position to take on the incumbents.
"I think the AI driven development story is more about leverage than it is about wisdom."
One of the things I'm going to be looking for over the next few years is whether extensive use of AI assistance will enhance the development of "wisdom" or inhibit it.
I suspect the latter, based on our existing experiences with leaning too much on help, but only time will tell. If it accelerates the development of this wisdom, it will be an invaluable tool; if it inhibits it, it will be the career equivalent of taking hard drugs: fun now, deadly over the long term. I'd advise those who are dabbling with it now to 1. keep an eye on whether or not your own skills are developing and 2. consider whether there's a way to use the tool such that your own skills continue to develop.
My company just got its own instance which prevents their data from being exposed outside of that container, so it won’t be used for (public) training. Surely companies like Apple can do the same.
If your instance is hosted by Microsoft, I doubt Apple would do the same.
My suspicion is that a company like Apple would want it on-prem, or not at all. A hosted instance with a pinky promise not to peek is not attractive to a lot of large businesses out there.
Your material point may still be valid. Apple could buy it, assuming MS was willing to give Apple a full copy and let them run it internally. (That would require divulging the details of the model to Apple, so I have no clue whether either of them is interested in dealing on such a basis. It would seem like a worthwhile deal to me if MS could get the right price from Apple, but people a lot smarter than me make that call.)
MS could certainly make a version that runs on-prem (while still restricting inspection of the model), if it wanted to. Whether it does or not, remains to be seen.
Not sure how you would do that? As soon as you load the card, all the weights are visible. You need a bit more than that, but not much, to reverse engineer the whole thing.
If I had to bet, they won't allow non hosted instances of a model until they are well on their way to completing the next model. That's just my gut feeling. But again, smarter people make those calls.
I wouldn't worry about that. Our software contains so much natural stupidity that artificial intelligence, even if it existed, wouldn't have a chance in hell.
A lot of companies, big and small, have been following this strategy. It makes a lot of sense to me. Companies should never blindly jump on the bandwagon of every novelty.
I'm probably the slow guy here, but how does this work? If I obtain any data, am I allowed to do anything I want with it? The law doesn't seem to work like that. Q: Where did you find the data? A: Someone uploaded it! People upload things to [say] The Pirate Bay or [say] YouTube all the time.
Could we then look at this type of automation as a kind of cryptographic data store where no one knows what is inside which instance?
The whole process of teaching it to keep things secret from the humans seems like a terrific idea. It only prevents people from checking whether it knows something. It will just happily continue using it as long as plausible deniability is satisfied.
If one can't provide a copy of the data about an EU citizen (and everything derived from it), shouldn't the EU citizen be entitled to receive the whole thing? And to request that it be deleted?
Say I steal everyone's chat log, does rolling a giant ball of data from it absolve my sins? If I allow others to pay not to make their upload public does that grant me absolution? The events don't seem remotely related.
This is going to be the new cryptocurrency bubble. People are going to give speeches about the revolutionary new system while the room fills with sinister motives for personal gain until the toxicity is dense enough to crush any positive effort.
I agree. There's an epic productivity boost, but there's a valid concern about prompts leaking sensitive data to a competitor. Fortunately, I think we are fairly close to being able to run our own instances of highly capable LLMs without too much friction.
How long till there's a self-hosted Copilot based on LLaMA or one of the plethora of other open source models? That will be a sizable market, I predict.
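To give a flavour of what "self-hosted" could look like today, here's a rough sketch using Hugging Face transformers with an open code model. The checkpoint name is just a placeholder for whichever open model you'd actually trust, and output quality is a separate question entirely:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder checkpoint; swap in whatever open code model you want to evaluate.
    model_name = "bigcode/starcoder"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

    # Ask the model to continue a function body, Copilot-style.
    prompt = 'def parse_config(path: str) -> dict:\n    """Load a YAML config file."""\n'
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

The infrastructure side is already easy; whether the completions are good enough to bother is the open question.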
I've been following this stuff pretty closely and so far absolutely nothing out there in the open model world is even remotely close to GPT-4 for coding. GPT-3.5 maybe, but I honestly find GPT-3.5 almost useless.
I don't think the evidence bears that out [0]. I agree that GPT-4 is way better than GPT-3.5, but I don't think most of the OSS models are even close to GPT-3.5. Vicuna is closer for simple tasks/conversation, but it still doesn't match GPT-3.5 elsewhere IMO, even though GPT-3.5 is also not great at complex tasks.
This is fair; my only evidence is my personal experience, most of which has consisted of trying models on Huggingface or similar, which isn't persuasive.
I will say, based on my cursory glance, that a lot of the tasks here seem odd for a chat AI, though there are certainly applications that might use them. E.g. asking if someone insulted another person, given some transcript of their conversation, seems like a somewhat tough sell to me. Nevertheless, ChatGPT performed better. Was that because ChatGPT has a better training set, a better architecture, or more examples in its training set related to the questions? Is that even knowable?
Apple has the financial and technical depth to create their own private GPT.
Business opportunity here is for anyone who can figure out how to do privacy preserving LLMs (without the need to trust the service provider) effectively (in terms of performance and cost).
All of the "ChatGPT banned" discussions seem to fail to differentiate ChatGPT (the D2C interface) from the B2B products. From a security and compliance perspective, they're essentially two completely different risk profiles.
* ChatGPT is the equivalent of putting your info on PasteBin or some other public sharing site. Anything you send there, you should not expect to remain private.
* The B2B side has much better controls and limits to minimize risk.
Why does OpenAI not address this concern directly? I’m sure it’s better for both Apple and OpenAI if there was a business agreement to not use their data.
OpenAI does address this: paying for API access [0] means that OpenAI does not use your data to train any models [1], while using ChatGPT directly at chat.openai.com does share your data for training by default. You can turn this off in settings [2], but companies would probably not be comfortable relying on that option.
> Do you store the data that is passed into the API?
> As of March 1st, 2023, we retain your API data for 30 days but no longer use your data sent via the API to improve our models. Learn more in our data usage policy.
They are working on ChatGPT Business.
"for professionals who need more control over their data as well as enterprises seeking to manage their end users.”
OpenAI's data privacy policy is... worrying. Even when they say they don't look at your data, they leave a carve-out that they will look at it if they are "concerned" about it. I hope things like this help pressure them to change.
But do they also ban google/bing/whatever? They also get access to proprietary queries. And possibly to internal links that someone might search by accident.
I bet search engines know a ton of company secrets.
They probably also don't want to get involved in a messy copyright suit if someone whose copyrighted data was hoovered up by ChatGPT claims it subsequently ended up in an Apple product.
I'm surprised it wasn't already restricted. It's been blocked at our company for weeks, for better or worse. Would love for them to spin-up access to the api version.
Siri itself isn't AI based right? Just advanced search engine?
But I'd bet the farm they are working on Siri upgraded to AI, Alexa too
Then stuff is going to get weird when it's everywhere, all the time, on every device, with voice recognition and speech, learning everything about everyone everywhere. That's a dystopian sci-fi TV series right there.
> BratGPT is dead. I spent way too much money on model costs for the few days this site was up. It was a good run. Over 50k people chatted with the brat.
I've never asked google search to write my emails, presentations, or feature proposals. Nor to summarize company documents like reports, memos and the like.
Been a while since I was at Apple. But we never used Google services and any third party tools we did use were vetted by a cross functional team from Security, Data Governance etc.
This idea that they would just allow sensitive data to be given to competitors in an unmanaged way is ridiculous. That sort of thing doesn't happen at any enterprise company let alone one of the most secretive in the industry.
For policy setting that's the wrong question. The real ones are: how likely is it to happen, how much would training/comms change that, and how acceptable is that risk. Their answers seem to have been: very, not enough, it's not.
Where the problem is and how to change it is an interesting topic, but likely irrelevant for the decision.
More than a decade ago, when I worked in software security, Google was banned from workstations, both because it spread viruses via ads and because of tracking.
Fast forward to 2018: at a large corp handling critical code and data, Chrome was also banned. Not for regular workers, but certain webapp features had to be sealed off.
It seems the service economy has finally started impacting the big companies in a substantial way. What's Apple going to do? As long as ChatGPT has the most users and the most momentum, Apple is losing a competitive advantage by refusing to use it.
Edit: I'm talking about how software as a service has taken off and it's becoming difficult or impossible to run modern tools locally. Back in the day, something like ChatGPT would have been sold with a license; now they will refuse to ever do that, because they can't make as much money that way.
I'm not talking about services like apple providing cloud services.
They aren't threatened by changes in the service economy. They'll be threatened by competitors that give all their data to "Open" AI, and use AI tools to increase productivity.
I always chuckle when I see companies trying to “ban” new technology. On one hand, I understand (it’s impossible to ensure the proper data security controls), but there are 100 new AI based applications popping up a day. Ban ChatGPT and people will just use some other tool (probably with even worse data security safeguards).
In my opinion, the only real way out of this is for companies to offer their own security-approved solution. This might take the form of an internally-hosted model and chat interface, or one pointing to Microsoft Azure’s OpenAI APIs (Azure having more enterprise friendly data security terms).
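As a sketch of the Azure option: the same openai Python client (2023-era, v0.27 style) can be pointed at an Azure OpenAI deployment. The resource name, deployment name, and API version below are placeholders, not anything specific to a real company:

    import os
    import openai

    openai.api_type = "azure"
    openai.api_base = "https://my-company-resource.openai.azure.com/"  # placeholder resource
    openai.api_version = "2023-05-15"  # example API version
    openai.api_key = os.environ["AZURE_OPENAI_KEY"]

    response = openai.ChatCompletion.create(
        engine="gpt-35-turbo",  # the Azure *deployment* name, not the raw model id
        messages=[{"role": "user", "content": "Summarize our incident response runbook."}],
    )
    print(response["choices"][0]["message"]["content"])

The appeal is that the traffic stays within the company's Azure tenancy and falls under Azure's enterprise data terms rather than the consumer ChatGPT policy.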
This article says Apple is working on their own LLM, and presumably they’re offering that for employees to test, but many other companies are simply closing their eyes and trying to pretend it doesn’t exist.
I don't think you should be advertising your services when you have such a tenuous grasp of the situation.
It has nothing to do with being fearful of new technology, or with Apple not having their own LLM to offer their employees. It's because ChatGPT allows sensitive data to be exfiltrated to a third party that in turn will make it available to the general public.
It's purely about information security and risk management. And having actually worked at Apple they take this stuff extremely seriously.
They should take it seriously. If a lot of your product is mostly copy-pastable, why on earth would you feed even pieces of it to a third party? There are so many legal and logistical issues with it that it isn't even funny. People using ChatGPT now are mostly doing so because they believe it makes them more efficient; that's rather counterproductive when your competitor can pay a penny to get your IP after months of efficient work. Or better yet, what's to stop OpenAI from eating your lunch? Nothing. Literally nothing. Even worse, at what point can OpenAI say they own your product because they developed it? In the land of business it's not about who is right, it's about who has the most money to pay to create the laws.
The OPs comment mentions the concern around data security multiple times. What gave you the impression they had a tenuous grasp on the situation? Feels like an unwarranted attack and that you only read what you wanted from the comment.
I think we’re saying the same thing - companies are concerned about the data and security risk. I still don’t think they’re going to be able to universally ban it, and I think doing so without an alternative in place will drive them to even less secure third party chat apps that still probably use the same API behind the scenes.
And I removed the last sentence from my comment, which didn’t mention a service, only that I was interested in chatting with folks in similar positions, precisely because this is a new space and many companies are scrambling to address it.
If you're going through the Apple network or using corporate equipment then of course they can universally ban it. They monitor all outbound traffic and audit what you're doing on your computer.
Do you actually think companies just give you root access and unrestricted internet and operate on a "we trust you" model ?
Of course not. My point is not about _how_ companies ban it, or how their IT security policies work, but rather that I think they need to offer an approved, internal alternative. Employees are clearly clamoring to use this type of technology and increasingly see it as necessary for their work. Trying to ban something without an alternative in place (or with draconian "we'll fire you if you try to use it" policies) is where you start to run into trouble.
I think you're missing the way this type of ban works. When you tell the employees not to use a banned service, it becomes an employee discipline issue. If they violate it, it's likely a problem for the company, but it becomes a serious problem for that employee.
> Ban ChatGPT and people will just use some other tool
Depends on the policy. The actual rule may be "no external AI-assisted tools or you're fired" rather than anything ChatGPT-specific. And I fully expect Apple will be able to tell from their network monitoring if you broke the rule.
True, but for now I think it’s good to be slow when you’re a big corp. Esp when it comes to code, it’s probably wise to wait out the early legal battles first.
This doesn’t mention an outright ban, just that ChatGPT use has been restricted (whatever that means).